THE EXCEPTIONAL ISOMORPHISMS BETWEEN THE
FINITE CLASSICAL GROUPS


JEAN DIEUDONNÉ


Introduction¹. The classical groups of small dimension exhibit isomorphisms among
themselves that may be called "generic", that is, isomorphisms that depend only on the
dimension and on the type of group considered, and not on the base field of the vector
space on which the group acts. These are the isomorphisms between the unimodular group
in two variables $SL_2(K)$, the symplectic group $Sp_2(K)$ and certain unitary groups
$U_2^+(K_1, f)$, where $K_1$ is a quadratic extension of the commutative field $K$, and
the isomorphisms between the orthogonal groups in 3, 4, 5 or 6 variables and certain
unimodular, unitary or symplectic groups (see (9) and (6)).
For the finite classical groups (to which the symmetric and alternating groups must
here be added), there are in addition a few exceptional isomorphisms, which do not fall
under the preceding types and have been known since Jordan (7) and Dickson (2); it is
now known that there can be no others (see (8) for the unimodular groups and the
alternating groups, (4) for the other groups). The methods by which the existence of
these isomorphisms has been proved so far amount to regarding the groups under study as
abstract groups, which one generates (by means of sometimes laborious computations) by
systems of generators and relations among these generators, chosen so that the relations
are the same for the groups whose isomorphism is to be established. In doing so, one
almost completely loses sight of the geometric origin of the groups under study, and the
existence of the isomorphisms obtained appears as pure chance. We propose, in this work,
to show that by staying closer to the geometry one arrives at proofs that are at least
as simple, and thanks to which the results appear somewhat more "natural".


Received September 11, 1953.

¹We essentially follow, in this work, the terminology and notation of (3), (4) and (5).
The evaluations of the orders of the classical groups that we use are taken from
(2, p. 309).


1. The unimodular groups $PSL_2(F_q)$ for $q = 2, 3, 4, 5$. The transformations of the
group $PSL_2(F_q)$ may be regarded as homographic transformations of the projective line
$P_1(F_q)$, permuting among themselves the $q + 1$ points of this line. If one of these
points is taken as the point at infinity, its complement in $P_1(F_q)$ may be identified
with the vector space $F_q$ (of dimension 1 over the field $F_q$); if $q = p^s$ ($p$
prime), an arbitrary translation in $F_q$ belongs to $PSL_2(F_q)$, leaves the point at
infinity invariant, and, regarded as a permutation of the $q$ points of $F_q$,
decomposes into a product of $p^{s-1}$ cycles of length $p$. For $q = 2$, one thus sees
that $PSL_2(F_2)$ contains all the transpositions of the 3 points of $P_1(F_2)$, hence is
identical with the symmetric group $\mathfrak{S}_3$. For $q = 3$, $PSL_2(F_3)$ contains
all the cycles $(a\ b\ c)$ of the group $\mathfrak{S}_4$, hence contains the alternating
group $\mathfrak{A}_4$, and since it has the same order as $\mathfrak{A}_4$, it is
identical with it. Likewise, for $q = 4$, $PSL_2(F_4)$ contains all the permutations of
the form $(a\ b)(c\ d)$ of the group $\mathfrak{A}_5$, and since it again has the same
order as the latter, it is identical with it.
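The order comparisons used in this section can be checked numerically. The sketch below assumes the standard formula $|PSL_2(F_q)| = q(q^2 - 1)/\gcd(2, q - 1)$, which is not derived in the text; the function name `psl2_order` is ours.

```python
from math import factorial, gcd

def psl2_order(q: int) -> int:
    """Order of PSL_2(F_q), assuming the formula q(q^2 - 1) / gcd(2, q - 1)."""
    return q * (q * q - 1) // gcd(2, q - 1)

# Orders of the permutation groups that appear in Section 1.
targets = {
    2: factorial(3),       # symmetric group S_3
    3: factorial(4) // 2,  # alternating group A_4
    4: factorial(5) // 2,  # alternating group A_5
    5: factorial(5) // 2,  # alternating group A_5
}

orders = {q: psl2_order(q) for q in targets}
matches = all(orders[q] == targets[q] for q in targets)
```

Equality of orders is only the last step of each argument; the geometric part (which permutations the group contains) is what the text supplies.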

The proof of the isomorphism of $PSL_2(F_5)$ and $\mathfrak{A}_5$ is a little less
immediate. One verifies very easily that the 6 points of $P_1(F_5)$ can be grouped in 5
different ways into three pairs $(a, b)$, $(c, d)$, $(e, f)$ such that each of these
pairs consists of points that are harmonic conjugates with respect to each of the other
two pairs (one of the pairs $(a, b)$ may be taken arbitrarily, and once $a$ is fixed,
there are 5 different possible choices for $b$). It is clear that every transformation of
$PSL_2(F_5)$ permutes these five groupings, whence a representation of $PSL_2(F_5)$ in
$\mathfrak{S}_5$; since $PSL_2(F_5)$ is simple, this representation is one-to-one, so its
image is a subgroup of $\mathfrak{S}_5$ of order equal to that of $PSL_2(F_5)$, that is,
60; since $\mathfrak{A}_5$ is the only subgroup of $\mathfrak{S}_5$ of index 2, the
isomorphism of $PSL_2(F_5)$ and $\mathfrak{A}_5$ is established.
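The count of five harmonic groupings can be reproduced by brute force. The sketch below is an illustration under our own conventions: points of $P_1(F_5)$ are written in homogeneous coordinates, and two pairs are taken to be harmonic when their cross-ratio equals $-1$, tested through $2 \times 2$ determinants so that the point at infinity needs no special case. All function names are ours.

```python
from itertools import combinations

P = 5
# Homogeneous coordinates: (x, 1) for x in F_5, and (1, 0) for the point at infinity.
points = [(x, 1) for x in range(P)] + [(1, 0)]

def det(u, v):
    return (u[0] * v[1] - u[1] * v[0]) % P

def harmonic(pair1, pair2):
    """True when the cross-ratio (a, b; c, d) equals -1 (the four points being distinct)."""
    (a, b), (c, d) = pair1, pair2
    return (det(a, c) * det(b, d) + det(a, d) * det(b, c)) % P == 0

def pairings(pts):
    """Yield every way of splitting the list pts into unordered pairs."""
    if not pts:
        yield []
        return
    first, rest = pts[0], pts[1:]
    for k, other in enumerate(rest):
        remaining = rest[:k] + rest[k + 1:]
        for tail in pairings(remaining):
            yield [(first, other)] + tail

# Keep the splittings into three pairs that are mutually harmonic.
groupings = [g for g in pairings(points)
             if all(harmonic(p, q) for p, q in combinations(g, 2))]
```

Of the 15 ways of splitting six points into three pairs, `groupings` keeps the harmonic ones, and every pair of points occurs in exactly one of them.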


2. The isomorphism between $PSL_2(F_7)$ and $PSL_3(F_2)$. We have $GL_3(F_2) = SL_3(F_2)$,
and since the centre of these two groups reduces to the identity element, $PSL_3(F_2)$ is
identified with $GL_3(F_2)$. By the fundamental theorem of projective geometry, since
$F_2$ admits only the identity automorphism, the group $PSL_3(F_2)$ is also the group
that permutes the 7 points of the projective plane $P_2(F_2)$ while transforming
collinear points into collinear points (and consequently concurrent lines into
concurrent lines). The involutions of the group $GL_3(F_2)$ are the transvections
(4, p. 13); each of them is entirely determined by a line of the projective plane
$P_2(F_2)$ and a point on this line; there are therefore 21 involutions. Moreover, two
transvections commute if and only if the point of one lies on the line of the other;
whence one concludes at once that at most three involutions of $GL_3(F_2)$ can commute
pairwise, and that there are 14 systems of such commuting involutions: 7 of these
systems correspond to the lines of $P_2(F_2)$ (each consisting of the three transvections
having that line as their line, one for each point on it); any two of these systems can
be transformed into each other by an inner automorphism of $GL_3(F_2)$. The 7 other
systems correspond to the points of $P_2(F_2)$ (each consisting of the three
transvections having that point as their point, one for each line through it); any two
of them can again be transformed into each other by an inner automorphism of
$GL_3(F_2)$; on the other hand such an automorphism cannot transform a system of this
kind into a system of the first type.
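The counts quoted above (21 involutions, at most three commuting pairwise, 14 systems) can be confirmed by enumerating all $3 \times 3$ matrices over $F_2$. This is only a brute-force sketch; the matrix encoding as tuples of rows and the helper names are ours.

```python
from itertools import combinations, product

def matmul(a, b):
    """Product of two 3x3 matrices over F_2, each given as a tuple of rows."""
    return tuple(tuple(sum(a[i][k] * b[k][j] for k in range(3)) % 2
                       for j in range(3)) for i in range(3))

def det(m):
    (a, b, c), (d, e, f), (g, h, i) = m
    return (a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)) % 2

IDENTITY = ((1, 0, 0), (0, 1, 0), (0, 0, 1))

all_matrices = [tuple(tuple(bits[3 * r:3 * r + 3]) for r in range(3))
                for bits in product((0, 1), repeat=9)]
group = [m for m in all_matrices if det(m) == 1]               # GL_3(F_2)
involutions = [m for m in group if m != IDENTITY and matmul(m, m) == IDENTITY]

def pairwise_commuting(ms):
    return all(matmul(x, y) == matmul(y, x) for x, y in combinations(ms, 2))

systems = [t for t in combinations(involutions, 3) if pairwise_commuting(t)]
quadruples = [t for t in combinations(involutions, 4) if pairwise_commuting(t)]
```

Here `systems` lists the triples of pairwise commuting involutions, and the empty list `quadruples` reflects the statement that no four involutions commute pairwise.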

To determine the isomorphism between $PSL_2(F_7)$ and $PSL_3(F_2)$, we shall study the
involutions of $PSL_2(F_7)$. Since $-1$ is not a square in $F_7$, an involution of
$PSL_2(F_7)$ can come only from an involution of the second kind of $GL_2(F_7)$ (the only
involutions of the first kind of $GL_2(F_7)$ having determinant $-1$); given a vector
$a \neq 0$ of the plane $F_7^2$, such an involution $u$ is entirely determined by
$b = u(a)$, provided that $b$ is not collinear with $a$; indeed, $a$ and $b$ then form a
basis of $F_7^2$, and since we must have $u(b) = -a$, $u$ is well determined. Note
moreover that $u$ and $-u$ give the same involution in $PSL_2(F_7)$, so that, taking a
basis $(a, b)$ of $F_7^2$, one obtains all the distinct involutions of $PSL_2(F_7)$ from
the involutions of the second kind $u$ of $GL_2(F_7)$ such that
$u(a) = \lambda a + \mu b$, where $\lambda$ runs through the elements of $F_7$, which we
denote by $0, \pm 1, \pm 2, \pm 3$, and where $\mu$ takes only the values 1, 2, 3; one
indeed finds 21 involutions in this way. Let $u_0$ be the one of these involutions for
which $\lambda = 0$, $\mu = 1$, and let us look for the involutions $u$ that give in
$PSL_2(F_7)$ an involution commuting with the one coming from $u_0$: for this it is
necessary and sufficient that $u_0 u$ again be an involution of the second kind of
$GL_2(F_7)$. One sees at once that, with respect to the basis formed by $a$ and
$b = u_0(a)$, the matrix of $u$ is

    $\begin{pmatrix} \lambda & -\dfrac{1 + \lambda^2}{\mu} \\ \mu & -\lambda \end{pmatrix}$

if $u(a) = \lambda a + \mu b$. Writing that the matrix of $u_0 u$ is of the preceding
form, one obtains the condition $\lambda^2 + \mu^2 + 1 = 0$, which gives the 4 distinct
involutions of $PSL_2(F_7)$ corresponding to the following values of $\lambda$ and $\mu$:

    $\lambda = \pm 3$, $\mu = 2$;
    $\lambda = \pm 2$, $\mu = 3$.

If one looks for those among these four involutions that commute, one finds that the
involution $u'_1$, corresponding to the pair $(3, 2)$, commutes with $u'_2$,
corresponding to $(-2, 3)$, and that $u''_1$, corresponding to $(-3, 2)$, commutes with
$u''_2$, corresponding to $(2, 3)$; there is no other pair of commuting involutions among
these four involutions.
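The congruence $\lambda^2 + \mu^2 + 1 \equiv 0 \pmod 7$ and the commutation pattern among its four solutions can be checked directly. The sketch assumes the matrix has trace 0 and determinant 1 with first column $(\lambda, \mu)$, and uses the fact that, for two such matrices giving distinct elements of $PSL_2(F_7)$, commuting in $PSL_2(F_7)$ is equivalent to their product having trace 0. The helper names are ours, and `pow(mu, -1, P)` requires Python 3.8 or later.

```python
from itertools import combinations

P = 7
LAMBDAS = [0, 1, -1, 2, -2, 3, -3]
MUS = [1, 2, 3]

def matrix(lam, mu):
    """Trace-0, determinant-1 matrix with first column (lam, mu), entries mod 7."""
    x = (-(1 + lam * lam) * pow(mu, -1, P)) % P
    return (lam % P, x, mu % P, (-lam) % P)  # rows (lam, x) and (mu, -lam)

def trace_of_product(m, n):
    a, b, c, d = m
    e, f, g, h = n
    return (a * e + b * g + c * f + d * h) % P

u0 = matrix(0, 1)
solutions = [(lam, mu) for mu in MUS for lam in LAMBDAS
             if (lam * lam + mu * mu + 1) % P == 0]
commuting_with_u0 = [(lam, mu) for mu in MUS for lam in LAMBDAS
                     if (lam, mu) != (0, 1)
                     and trace_of_product(u0, matrix(lam, mu)) == 0]
commuting_pairs = [(s, t) for s, t in combinations(solutions, 2)
                   if trace_of_product(matrix(*s), matrix(*t)) == 0]
```

The list `commuting_with_u0` coincides with `solutions`, and `commuting_pairs` contains exactly the two pairs named in the text.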

Now let $t$ be a transvection of vector $a$, such that $t(a) = a$, $t(b) = b + a$; the
successive powers $t, t^2, \ldots, t^6$ are all the distinct transvections of vector $a$.
Denote by $P_k$ ($1 \le k \le 7$) the system of the three pairwise commuting involutions

    $t^{k-1} u_0 t^{-k+1}$,  $t^{k-1} u'_1 t^{-k+1}$,  $t^{k-1} u'_2 t^{-k+1}$

and by $D_k$ ($1 \le k \le 7$) the system of the three pairwise commuting involutions

    $t^{k-1} u_0 t^{-k+1}$,  $t^{k-1} u''_1 t^{-k+1}$,  $t^{k-1} u''_2 t^{-k+1}$.

Since $t^k u_0 t^{-k}$ is the involution that transforms $a$ into $t^k(b)$, one easily
verifies that the 14 systems $P_k$, $D_k$ contain the 21 distinct involutions of
$PSL_2(F_7)$. Two $P_k$ (resp. $D_k$) with distinct indices have no involution in common,
and each involution of $PSL_2(F_7)$ belongs to exactly one $P_k$ and exactly one $D_k$.
If two systems are called incident when they have an involution in common, one easily
verifies that the incidence relations between the $P_k$ and the $D_k$ are exactly those
of the 7 points and the 7 lines of the projective plane $P_2(F_2)$. This being so, a
transformation $s$ of $PSL_2(F_7)$ permutes the 21 involutions of $PSL_2(F_7)$ (by the
inner automorphism $u \to s u s^{-1}$), obviously transforms a $P_k$ (resp. $D_k$) into a
$P_h$ or a $D_h$, and preserves the "incidence relations". One deduces a representation
$\varphi$ of $PSL_2(F_7)$ in the group $\Gamma$ of the permutations of the 14 points and
lines of $P_2(F_2)$ that preserve the incidence relations. But the fundamental theorem
of projective geometry shows that $\Gamma$ is a group having as normal subgroup of index
2 the group $PSL_3(F_2)$, which is itself simple; $\Gamma$ therefore has no other
subgroup of index 2. On the other hand, since $PSL_2(F_7)$ is simple, $\varphi$ is an
isomorphism onto a subgroup of $\Gamma$ which, being of order equal to that of
$PSL_2(F_7)$, is necessarily of index 2 in $\Gamma$, hence identical with $PSL_3(F_2)$.
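The incidence structure described in this section can be explored computationally. The following sketch enumerates the involutions of $PSL_2(F_7)$ as classes $\{m, -m\}$ of matrices in $SL_2(F_7)$, collects the triples of pairwise commuting involutions, and two-colours the resulting incidence graph. It does not use the explicit systems $P_k$, $D_k$ of the text; the encoding and all names are ours.

```python
from itertools import combinations, product

P = 7
IDENTITY = (1, 0, 0, 1)

def mul(m, n):
    a, b, c, d = m
    e, f, g, h = n
    return ((a * e + b * g) % P, (a * f + b * h) % P,
            (c * e + d * g) % P, (c * f + d * h) % P)

def canon(m):
    """Canonical representative of the class {m, -m} in PSL_2(F_7)."""
    return min(m, tuple((-x) % P for x in m))

sl2 = [m for m in product(range(P), repeat=4) if (m[0] * m[3] - m[1] * m[2]) % P == 1]
involutions = sorted({canon(m) for m in sl2
                      if canon(m) != IDENTITY and canon(mul(m, m)) == IDENTITY})

def commute(x, y):
    return canon(mul(x, y)) == canon(mul(y, x))

systems = [frozenset(t) for t in combinations(involutions, 3)
           if all(commute(x, y) for x, y in combinations(t, 2))]

# Two systems are incident when they share an involution.
neighbours = {s: [t for t in systems if t != s and s & t] for s in systems}

# Two-colour the incidence graph to separate "points" from "lines".
colour = {systems[0]: 0}
stack = [systems[0]]
while stack:
    s = stack.pop()
    for t in neighbours[s]:
        if t not in colour:
            colour[t] = 1 - colour[s]
            stack.append(t)

bipartite = all(colour[s] != colour[t] for s in systems for t in neighbours[s])
class_sizes = sorted(list(colour.values()).count(c) for c in (0, 1))
plane_like = all(len(set(neighbours[s]) & set(neighbours[t])) == 1
                 for s, t in combinations(systems, 2) if colour[s] == colour[t])
```

A structure with 7 + 7 elements, three incidences per element, and exactly one common neighbour for any two elements of the same class is the configuration of points and lines of $P_2(F_2)$.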


3. The isomorphism between $PSL_2(F_9)$ and $\mathfrak{A}_6$. Here again we start from
the involutions of $PSL_2(F_9)$. Since $-1$ is a square in $F_9$ (we denote by $i$ an
element of $F_9$ such that $i^2 = -1$), the involutions of $PSL_2(F_9)$ come from the
transformations $u$ of $SL_2(F_9)$ defined as follows: the vector space $F_9^2$ is
decomposed into the direct sum of two lines $D^+$ and $D^-$, and $u(x) = ix$ in $D^+$,
$u(x) = -ix$ in $D^-$. There is no other involution in $PSL_2(F_9)$, for the involutions
of the first kind of $GL_2(F_9)$ have determinant $-1$ and the other involutions of the
second kind have determinant $\neq 1$. An involution of $PSL_2(F_9)$ is therefore
entirely determined by a pair of distinct points of the projective line $P_1(F_9)$,
corresponding to the lines $D^+$ and $D^-$; we identify $P_1(F_9)$ with the line $x = 1$
in the vector space $F_9^2$, completed by the point at infinity $\infty$.

It is immediate that two involutions of $PSL_2(F_9)$ commute if and only if the pairs of
points of $P_1(F_9)$ corresponding to them are harmonic conjugates. One then sees at once
that, for a given involution, there exist two triples of pairwise commuting involutions
containing the given involution. If this involution corresponds to the pair
$(0, \infty)$, the two triples correspond respectively to the following pairs:


    $T_1$  : $(0, \infty)$, $(1, -1)$, $(i, -i)$,
    $T'_1$ : $(0, \infty)$, $(1 + i, -1 - i)$, $(1 - i, -1 + i)$.


One verifies moreover that there exists a transformation of $PGL_2(F_9)$ transforming
the pairs of $T_1$ into the pairs of $T'_1$, but this transformation (which is the
homothety of centre 0 and ratio $1 + i$) does not belong to $PSL_2(F_9)$. This being so,
since there always exists a transformation of $PSL_2(F_9)$ sending any pair of points of
the projective line $P_1(F_9)$ onto any other, one sees that the 45 pairs of points of
$P_1(F_9)$ are distributed into 15 triples that are transforms of $T_1$, every pair
belonging to one and only one triple, the distribution being the following:

    $T_2$    : $(1, \infty)$,      $(-1, 0)$,        $(1 + i, 1 - i)$
    $T_3$    : $(-1, \infty)$,     $(0, 1)$,         $(-1 + i, -1 - i)$
    $T_4$    : $(i, \infty)$,      $(1 + i, -1 + i)$, $(-i, 0)$
    $T_5$    : $(-i, \infty)$,     $(1 - i, -1 - i)$, $(0, i)$
    $T_6$    : $(1 + i, \infty)$,  $(-1 + i, i)$,    $(1 - i, 1)$
    $T_7$    : $(-1 - i, \infty)$, $(-i, 1 - i)$,    $(-1, -1 + i)$
    $T_8$    : $(1 - i, \infty)$,  $(-1 - i, -i)$,   $(1, 1 + i)$
    $T_9$    : $(-1 + i, \infty)$, $(i, 1 + i)$,     $(-1 - i, -1)$
    $T_{10}$ : $(1, i)$,           $(-1, -i)$,       $(1 - i, -1 + i)$
    $T_{11}$ : $(-1, i)$,          $(1, -i)$,        $(1 + i, -1 - i)$
    $T_{12}$ : $(0, 1 + i)$,       $(-1, 1 - i)$,    $(-i, -1 + i)$
    $T_{13}$ : $(0, -1 - i)$,      $(1, -1 + i)$,    $(i, 1 - i)$
    $T_{14}$ : $(0, 1 - i)$,       $(-1, 1 + i)$,    $(i, -1 - i)$
    $T_{15}$ : $(0, -1 + i)$,      $(1, -1 - i)$,    $(-i, 1 + i)$


(the triples of indices $2k$ and $2k + 1$ are deduced from one another by symmetry with
centre 0).
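The harmonic relations between pairs of points of $P_1(F_9)$ can be tabulated by machine. The sketch below models $F_9$ as $F_3[i]$ with $i^2 = -1$, each element being a pair $(x, y)$ standing for $x + yi$, and tests harmonicity through determinants of homogeneous coordinates. It counts all triples of mutually harmonic pairs, that is, the transforms of $T_1$ together with those of $T'_1$. The encoding and names are ours.

```python
from itertools import combinations, product

ZERO, ONE = (0, 0), (1, 0)

def add(a, b):
    return ((a[0] + b[0]) % 3, (a[1] + b[1]) % 3)

def neg(a):
    return ((-a[0]) % 3, (-a[1]) % 3)

def mul(a, b):
    """(a0 + a1 i)(b0 + b1 i) with i^2 = -1, coefficients mod 3."""
    return ((a[0] * b[0] - a[1] * b[1]) % 3, (a[0] * b[1] + a[1] * b[0]) % 3)

field = list(product(range(3), repeat=2))
# Homogeneous coordinates: (z, 1) for z in F_9, and (1, 0) for the point at infinity.
points = [(z, ONE) for z in field] + [(ONE, ZERO)]

def det(u, v):
    return add(mul(u[0], v[1]), neg(mul(u[1], v[0])))

def harmonic(p, q):
    """True when the cross-ratio (a, b; c, d) equals -1."""
    (a, b), (c, d) = p, q
    return add(mul(det(a, c), det(b, d)), mul(det(a, d), det(b, c))) == ZERO

pairs = list(combinations(points, 2))
degree = {p: sum(harmonic(p, q) for q in pairs if q != p) for p in pairs}
triples = [t for t in combinations(pairs, 3)
           if all(harmonic(p, q) for p, q in combinations(t, 2))]
```

Each of the 45 pairs is harmonic to four others and lies in two of the triples found, in agreement with the two triples $T_1$, $T'_1$ through $(0, \infty)$.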

Multiplying the points of each pair by $1 + i$, one obtains another distribution into 15
triples $T'_1, \ldots, T'_{15}$, transforms of $T'_1$ by the transformations of
$PSL_2(F_9)$. Now observe that if, for two triples $T_h$, $T_k$, there exists a pair of
$T_h$ that commutes with a pair of $T_k$, these two pairs belong to one and the same
triple $T'_j$. One then verifies without difficulty that the triple $T_1$ belongs to two
systems of triples


    $T_1, T_6, T_7, T_{14}, T_{15}$
    $T_1, T_8, T_9, T_{12}, T_{13}$


such that two pairs belonging to two distinct triples of the same system never
correspond to two commuting involutions; moreover, these systems of 5 triples are
maximal for the preceding property.

This being so, in view of the fact that the triples $T_j$ are permuted transitively by
$PSL_2(F_9)$, there exist 6 maximal systems of 5 triples $T_j$ of the preceding type,
each triple belonging to exactly two of these systems.² Since these 6 systems are
obviously permuted by $PSL_2(F_9)$, one sees that one obtains a representation $\varphi$
of $PSL_2(F_9)$ in $\mathfrak{S}_6$. Since $PSL_2(F_9)$ is simple and of the same order as
$\mathfrak{A}_6$, and since $\mathfrak{A}_6$ is the only subgroup of $\mathfrak{S}_6$ of
index 2, $\varphi$ is an isomorphism of $PSL_2(F_9)$ onto $\mathfrak{A}_6$.


²These systems also appear in the studies of the "Valentiner group" formed of projective
transformations of the complex projective plane, which is isomorphic to $\mathfrak{A}_6$
(see (10)).


4. The isomorphism between $PSL_4(F_2)$ and $\mathfrak{A}_8$. The group $SL_4(F_2)$,
being equal to $GL_4(F_2)$ and having a centre reduced to the identity element, may be
identified with $PSL_4(F_2)$. We shall study how $GL_4(F_2)$ permutes the bivectors (1)
on the space $E = F_2^4$; if $(e_i)_{1 \le i \le 4}$ is a basis of this space, the 64
bivectors on $E$ are given by the formula

    $z = \epsilon_{12} e_1 \wedge e_2 + \epsilon_{13} e_1 \wedge e_3 + \epsilon_{14} e_1 \wedge e_4 + \epsilon_{23} e_2 \wedge e_3 + \epsilon_{24} e_2 \wedge e_4 + \epsilon_{34} e_3 \wedge e_4$

where each of the $\epsilon_{ij}$ takes one of the values 0, 1. The bivectors of rank 4
are those for which the pfaffian

    (1)    $\epsilon_{12}\epsilon_{34} + \epsilon_{13}\epsilon_{24} + \epsilon_{14}\epsilon_{23} = 1.$

One sees at once that there are 28 of these bivectors; moreover, since each of them can
be written in the form $a \wedge b + c \wedge d$, where $(a, b, c, d)$ is a basis of $E$,
any two of them can be transformed into each other by a transformation of $GL_4(F_2)$.

Consider one of the preceding bivectors, for example the bivector $z_0$ all of whose
components $\epsilon_{ij} = 1$, and let us look for the maximal systems of bivectors of
rank 4, containing $z_0$, and such that for any two bivectors of the system one has
$z \wedge z' \neq 0$, that is

    (2)    $\epsilon_{12}\epsilon'_{34} + \epsilon'_{12}\epsilon_{34} + \epsilon_{13}\epsilon'_{24} + \epsilon'_{13}\epsilon_{24} + \epsilon_{14}\epsilon'_{23} + \epsilon'_{14}\epsilon_{23} = 1.$

Replacing $z'$ by $z_0$ in this relation, one sees that the bivectors $z$ of one of the
systems sought must be such that $\sum \epsilon_{ij} = 1$; taking (1) into account, one
sees that exactly three of the $\epsilon_{ij}$ must be equal to 1, two of which form an
"opposite" pair, that is, a pair in which the double indices $(i, j)$ have no single
index in common (i.e., the pairs $(\epsilon_{12}, \epsilon_{34})$,
$(\epsilon_{13}, \epsilon_{24})$, $(\epsilon_{14}, \epsilon_{23})$). If one then takes
(2) into account, it is easy to verify that there are exactly two maximal systems of the
type sought containing $z_0$, namely the system formed of $z_0$ and the 6 bivectors:

    (1, 1, 0, 0, 1, 0)
    (1, 1, 0, 0, 0, 1)
    (1, 0, 1, 1, 0, 0)
    (0, 1, 1, 1, 0, 0)
    (0, 0, 1, 0, 1, 1)
    (0, 0, 0, 1, 1, 1)

(where the coordinates of the bivector are arranged in the order they have in relation
(1)), and the system formed of $z_0$ and the 6 bivectors:

    (1, 1, 1, 0, 0, 0)
    (1, 1, 0, 1, 0, 0)
    (0, 0, 1, 1, 1, 0)
    (0, 0, 1, 1, 0, 1)
    (1, 0, 0, 0, 1, 1)
    (0, 1, 0, 0, 1, 1)


Since the bivectors of rank 4 are permuted transitively by $GL_4(F_2)$, there exist 8
maximal systems of 7 bivectors of the preceding type, each bivector belonging to exactly
two of these systems. Since these 8 systems are obviously permuted by $GL_4(F_2)$, one
obtains a representation of $GL_4(F_2)$ in $\mathfrak{S}_8$. As in no. 3, one concludes
that this representation is an isomorphism of $GL_4(F_2)$ onto $\mathfrak{A}_8$.
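The counts of this section (28 bivectors of rank 4, 8 systems of 7 bivectors, two of them through $z_0$) lend themselves to a direct enumeration. In the sketch below a bivector is a 6-tuple of components in the order of relation (1), and the systems are searched for as sets of 7 bivectors satisfying relation (2) pairwise; the search routine and all names are ours.

```python
from itertools import product

def pfaffian(z):
    """Left side of relation (1); z = (e12, e34, e13, e24, e14, e23) over F_2."""
    return (z[0] * z[1] + z[2] * z[3] + z[4] * z[5]) % 2

def wedge(z, w):
    """Left side of relation (2): the coefficient of e1^e2^e3^e4 in z ^ w."""
    return sum(z[k] * w[k + 1] + z[k + 1] * w[k] for k in (0, 2, 4)) % 2

rank4 = [z for z in product((0, 1), repeat=6) if pfaffian(z) == 1]
adjacent = {z: {w for w in rank4 if w != z and wedge(z, w) == 1} for z in rank4}

def systems_of_size(size):
    """All sets of `size` rank-4 bivectors that satisfy relation (2) pairwise."""
    found = []

    def extend(chosen, candidates):
        if len(chosen) == size:
            found.append(frozenset(chosen))
            return
        for k, v in enumerate(candidates):
            extend(chosen + [v], [w for w in candidates[k + 1:] if w in adjacent[v]])

    extend([], rank4)
    return found

systems = systems_of_size(7)
z0 = (1, 1, 1, 1, 1, 1)
systems_through_z0 = [s for s in systems if z0 in s]
```

The search confirms that no set of 8 bivectors satisfies relation (2) pairwise, so the systems of 7 are the largest possible.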
5. The isomorphism between $Sp_4(F_2)$ and $\mathfrak{S}_6$. The results of §4 show at
once that the symplectic group $Sp_4(F_2)$ is isomorphic to $\mathfrak{S}_6$. Indeed,
$Sp_4(F_2)$ may be regarded as the subgroup of $GL_4(F_2)$ leaving invariant a bivector
$z_0$ of rank 4; since $z_0$ is the intersection of two maximal systems, the
transformations of $Sp_4(F_2)$ are those that leave these two systems invariant or
interchange them. But it is immediate that the subgroup of $\mathfrak{A}_8$ consisting of
the permutations that leave fixed or interchange two of the 8 objects permuted by
$\mathfrak{A}_8$ is isomorphic to $\mathfrak{S}_6$.

The same result can be obtained by a more "intrinsic" argument, of which we shall only
indicate the main lines. One easily sees that each of the 15 vectors $\neq 0$ of $F_2^4$
belongs to two maximal systems of 5 vectors, no two of which are conjugate for the
alternating form $(x, y)$ that defines the symplectic group $Sp_4(F_2)$. For example, if
one takes a symplectic basis $(e_1, e_2, e_3, e_4)$ of $F_2^4$ such that
$(e_1, e_2) = 1$, $(e_3, e_4) = 1$, $(e_i, e_j) = 0$ for the other pairs of indices, one
sees that the two maximal systems of 5 vectors that contain $e_1$ are


    $e_1$, $e_2$, $e_1 + e_2 + e_3$, $e_1 + e_2 + e_4$, $e_1 + e_2 + e_3 + e_4$


and


    $e_1$, $e_1 + e_2$, $e_2 + e_3$, $e_2 + e_4$, $e_2 + e_3 + e_4$.


Since $Sp_4(F_2)$ permutes transitively the 15 vectors $\neq 0$ of $F_2^4$, there are 6
maximal systems of the preceding type, each vector belonging to exactly two such
systems. Since the only symplectic transformation that leaves all the vectors $\neq 0$
invariant is the identity, one sees that one obtains in this way an isomorphism of
$Sp_4(F_2)$ into the symmetric group $\mathfrak{S}_6$; the two groups having the same
order, the conclusion follows.
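The six systems of 5 pairwise non-conjugate vectors can be listed exhaustively. The sketch below uses the alternating form with $(e_1, e_2) = (e_3, e_4) = 1$ on $F_2^4$, as in the text, and runs over all 5-element subsets of the 15 non-zero vectors; the variable names are ours.

```python
from itertools import combinations, product

def form(x, y):
    """Alternating form on F_2^4 with (e1, e2) = (e3, e4) = 1."""
    return (x[0] * y[1] + x[1] * y[0] + x[2] * y[3] + x[3] * y[2]) % 2

vectors = [v for v in product((0, 1), repeat=4) if any(v)]

def non_conjugate_systems(size):
    """All sets of `size` non-zero vectors, no two of which are conjugate."""
    return [frozenset(s) for s in combinations(vectors, size)
            if all(form(x, y) == 1 for x, y in combinations(s, 2))]

systems = non_conjugate_systems(5)
e1 = (1, 0, 0, 0)
through_e1 = [s for s in systems if e1 in s]
```

No set of 6 vectors has the property, so the systems of 5 vectors found here are the largest ones.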


6. The structure of the group $U_3(F_4)$. The field $F_4$ is obtained by adjoining to
$F_2$ the root $\omega$ of the quadratic equation $\omega^2 + \omega + 1 = 0$, the other
root being $\omega + 1 = \omega^{-1} = \omega^2$; the group of the elements of $F_4$ of
norm 1 consists of the 3 cube roots of unity 1, $\omega$ and $\omega^2$. This group is
therefore isomorphic to the centre $Z_3$ of $U_3$ and to the quotient group
$U_3/U_3^+$. We shall study the normal subgroup $T_3$ of $U_3^+$ generated by the
unitary transvections (5).

Observe first that if $a$ is a non-isotropic vector of $E = F_4^3$, the plane $P$
orthogonal to the line $aF_4$ contains two orthogonal non-isotropic lines $bF_4$, $cF_4$
and three isotropic lines; the planes $Q$ and $R$ orthogonal to $bF_4$ and $cF_4$
likewise each contain three isotropic lines. The 9 remaining lines of $E$ are
non-isotropic: there are three of them in each (isotropic) plane passing, for example,
through $aF_4$ and an isotropic vector $e$ of the plane $P$; they are defined, for
example, by the vectors $e + a$, $e + \omega a$, $e + \omega^2 a$. One thus sees that the
12 non-isotropic lines of $E$ are distributed into 4 trirectangular trihedra.
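The line counts of this paragraph can be reproduced with a small model of $F_4$. In the sketch below the elements are coded as 0, 1, 2, 3 with 2 standing for $\omega$ and 3 for $\omega^2$, addition is bitwise XOR, and the Hermitian form is assumed to be $\sum x_i \bar{y}_i$ in an orthogonal basis $(a, b, c)$ of vectors of norm 1. The encoding and function names are ours.

```python
from itertools import combinations, product

def mul(x, y):
    """Product in F_4 = {0, 1, 2, 3}, where 2 stands for w and 3 for w^2 = w + 1."""
    return 0 if 0 in (x, y) else (x + y - 2) % 3 + 1

CONJ = {0: 0, 1: 1, 2: 3, 3: 2}  # x -> x^2, the non-trivial automorphism of F_4

def herm(x, y):
    """Hermitian form sum_i x_i * conj(y_i); addition in F_4 is bitwise XOR."""
    total = 0
    for xi, yi in zip(x, y):
        total ^= mul(xi, CONJ[yi])
    return total

# One representative per line: the first non-zero coordinate is normalised to 1.
lines = [v for v in product(range(4), repeat=3)
         if any(v) and next(c for c in v if c) == 1]
isotropic = [v for v in lines if herm(v, v) == 0]
non_isotropic = [v for v in lines if herm(v, v) != 0]

# Trirectangular trihedra: triples of pairwise orthogonal non-isotropic lines.
trihedra = [f for f in combinations(non_isotropic, 3)
            if all(herm(x, y) == 0 for x, y in combinations(f, 2))]
```

The trihedra found are disjoint and together exhaust the non-isotropic lines.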

This being so, a unitary transvection of (necessarily isotropic) vector $e \in P$, for
example, leaves the vector $a$ invariant and interchanges the two orthogonal lines
$bF_4$ and $cF_4$. One therefore sees that the group $T_3$ permutes the three lines
$aF_4$, $bF_4$, $cF_4$ among themselves in all possible ways. It consequently contains a
normal subgroup $T'_3$ formed of the transformations of $T_3$ that leave each of the
lines $aF_4$, $bF_4$, $cF_4$ invariant, and it is clear that $T_3/T'_3$ is isomorphic to
the symmetric group $\mathfrak{S}_3$. Observe on the other hand that the matrix of a
transformation of $T'_3$ with respect to the basis $(a, b, c)$ is a diagonal matrix

    $\begin{pmatrix} \alpha & 0 & 0 \\ 0 & \beta & 0 \\ 0 & 0 & \gamma \end{pmatrix}$

where $\alpha$, $\beta$, $\gamma$ are taken among the cube roots of unity 1, $\omega$ and
$\omega^2$; since the determinant must be 1, one sees that $\alpha$, $\beta$, $\gamma$
are either all equal or all distinct; the first case corresponds to the matrices of the
centre $Z_3$, and there remain 6 other matrices, so that $T'_3/Z_3$ is cyclic of order 3.
Finally, since $U_3^+$ is of order 216, $U_3/T_3$ is of order 12; one verifies at once
that a quasi-symmetry of ratio $\omega$ with respect to the plane $P$ permutes cyclically
the three trirectangular trihedra distinct from $(aF_4, bF_4, cF_4)$. The group
$U_3/T_3$ therefore permutes the four trirectangular trihedra formed with the
non-isotropic lines, and contains all the 3-cycles on these four objects; it therefore
contains the alternating group $\mathfrak{A}_4$, and since it has the same order, it is
isomorphic to it.
Finally, we have for $U_3$ a composition series

    $U_3 \supset U_3^+ \supset T_3 \supset T'_3 \supset Z_3 \supset \{1\}$

where the successive quotients are cyclic of order 3, except $U_3^+/T_3$, isomorphic to
the "Vierergruppe", and $T_3/T'_3$, isomorphic to $\mathfrak{S}_3$.


7. The isomorphism between $PU_4^+(F_4)$ and $PSp_4(F_3)$. Keeping the notation of §6,
let $(a, b, c, d)$ be an orthogonal basis of $F_4^4$; we have seen that the subspace
$H_d$ of dimension 3 spanned by $a$, $b$, $c$ contains 12 non-isotropic lines and 9
isotropic lines. On the other hand, a plane passing through $d$ and a non-isotropic line
$xF_4$ of $H_d$ contains only isotropic lines apart from $dF_4$ and $xF_4$; the
non-isotropic lines other than $dF_4$ and not contained in $H_d$ are therefore contained
in the planes passing through $d$ and one of the isotropic lines of $H_d$, and each of
these planes contains 3 of them; one therefore obtains finally 40 non-isotropic lines in
$F_4^4$ (since there are in all $255/3 = 85$ lines in $F_4^4$, there are 45 isotropic
lines in this space).
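These counts can be recomputed in a few lines. The sketch assumes the Hermitian form $\sum x_i \bar{x}_i$ in an orthogonal basis of vectors of norm 1; since $x \bar{x} = 1$ for every non-zero $x$ of $F_4$, the value $(v, v)$ is the number of non-zero coordinates of $v$ reduced mod 2. The function name is ours.

```python
from itertools import product

def line_counts(n):
    """Return (isotropic, non-isotropic) line counts of F_4^n for sum x_i * conj(x_i)."""
    iso = non_iso = 0
    for v in product(range(4), repeat=n):   # coordinates coded 0..3; only zero/non-zero matters
        if not any(v):
            continue
        if sum(1 for c in v if c) % 2 == 0:
            iso += 1
        else:
            non_iso += 1
    return iso // 3, non_iso // 3           # each line carries 3 non-zero vectors

counts_dim3 = line_counts(3)
counts_dim4 = line_counts(4)
lines_f3_dim4 = (3 ** 4 - 1) // (3 - 1)     # number of lines of F_3^4
```

The last value is the number of lines of $F_3^4$, which is compared with the number of non-isotropic lines of $F_4^4$ in what follows.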

Observe on the other hand that there are exactly 40 lines in $F_3^4$. We shall see that
one can define a one-to-one correspondence between the non-isotropic lines of $F_4^4$
and the lines of $F_3^4$, in such a way that to two orthogonal lines correspond two
orthogonal lines. This will make it possible to establish the isomorphism between
$PU_4^+(F_4)$ and $PSp_4(F_3)$ in the following way: a transformation $\bar{u}$ of
$PU_4^+(F_4)$ permutes the non-isotropic lines and transforms two orthogonal lines into
orthogonal lines; there therefore corresponds to it a permutation $\bar{u}'$ of the lines
of $F_3^4$ preserving orthogonality, and consequently transforming every hyperplane into
a hyperplane. By virtue of the fundamental theorem of projective geometry (and since
$F_3$ has no non-identical automorphism) $\bar{u}'$ comes from a linear transformation
$u'$ of $F_3^4$ that preserves orthogonality; it is known (4, p. 31) that such a
transformation satisfies $(u'(x), u'(y)) = \pm (x, y)$, where $(x, y)$ is the
alternating form defining the symplectic group. Now these transformations form a group
in which $Sp_4(F_3)$ is an invariant subgroup of index 2. One therefore sees that one
obtains an isomorphism $\varphi$ of $PU_4^+(F_4)$ into a group $\Gamma$, of which the
group $PSp_4(F_3)$ is an invariant subgroup of index 2. Since $PSp_4(F_3)$ is simple,
$\Gamma$ has no other subgroup of index 2; but the orders of $PU_4^+(F_4)$ and of
$PSp_4(F_3)$ being both equal to 25920, these two groups are isomorphic.

It remains therefore to describe the correspondence in question; we shall do so by
associating 40 vectors on the non-isotropic lines of $F_4^4$ with 40 vectors on the lines
of $F_3^4$. Observe that 4 pairwise orthogonal lines of $F_4^4$ must correspond to 4
lines situated in one and the same totally isotropic plane of $F_3^4$. On the other
hand, the 4 non-isotropic lines of one and the same isotropic plane of $F_4^4$ must
correspond to the 4 lines of one and the same non-isotropic plane of $F_3^4$: consider
indeed an isotropic vector $e$ in the plane defined by $aF_4$ and $bF_4$; the planes
$P_1$, $P_2$ defined by $e$ and $c$, $d$ respectively are then isotropic, and every
vector of $P_1$ is orthogonal to every vector of $P_2$, two non-collinear and
non-isotropic vectors situated in one of these planes never being orthogonal; if our
correspondence exists, to two distinct non-isotropic lines of $P_2$ correspond two
non-orthogonal lines in $F_3^4$, which therefore span a non-isotropic plane $Q_2$, and to
the 4 non-isotropic lines of $P_1$ must correspond 4 lines orthogonal to $Q_2$, that is,
the 4 lines of the non-isotropic plane $Q_1$ orthogonal to $Q_2$.

Now let $(e_1, e_2, e_3, e_4)$ be a symplectic basis of $F_3^4$, such that

    $(e_1, e_3) = (e_2, e_4) = 1$

and $(e_i, e_j) = 0$ for every other pair of indices. To the 4 vectors $a$, $b$, $c$, $d$
of the basis chosen in $F_4^4$, we make correspond 4 vectors of an isotropic plane
according to the rule:

    $a$      | $b$      | $c$           | $d$
    $e_1$    | $e_2$    | $e_1 - e_2$   | $e_1 + e_2$


If one observes that the 4 vectors $b + c + d$, $c + d + a$, $d + a + b$, $a + b + c$
form an orthogonal basis of $F_4^4$ in which each vector is orthogonal to the vector in
the same place in $(a, b, c, d)$, one is led to make correspond to them in $F_3^4$ 4
vectors according to the rule:

    $a + b + c$    | $b + c + d$ | $a + c + d$ | $a + b + d$
    $e_3 - e_4$    | $e_4$       | $e_3$       | $e_3 + e_4$


One then finds easily, taking into account the remarks made above, that the following
correspondence for the non-isotropic vectors of the hyperplane $H_d$ respects
orthogonality and transforms the vectors of an isotropic plane into the vectors of a
non-isotropic plane:

    $H_d$ | $a$                         | $e_1$                   | $a + b + c$                   | $e_3 - e_4$
          | $b$                         | $e_2$                   | $a + \omega b + \omega^2 c$   | $e_1 + e_2 - e_3 + e_4$
          | $c$                         | $e_1 - e_2$             | $a + \omega^2 b + \omega c$   | $e_1 + e_2 + e_3 - e_4$
          | $a + \omega b + \omega c$   | $e_1 + e_3 - e_4$       | $a + \omega^2 b + \omega^2 c$ | $e_1 - e_3 + e_4$
          | $a + b + \omega^2 c$        | $e_1 - e_2 - e_3 + e_4$ | $a + b + \omega c$            | $e_1 - e_2 + e_3 - e_4$
          | $a + \omega^2 b + c$        | $e_2 - e_3 + e_4$       | $a + \omega b + c$            | $e_2 + e_3 - e_4$


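One proceeds in the same way in the three hyperplanes $H_c$, $H_b$, $H_a$ orthogonal
respectively to $c$, $b$, $a$, which gives the correspondence:

    $H_c$ | $a$                         | $e_1$                   | $a + b + d$                   | $e_3 + e_4$
          | $b$                         | $e_2$                   | $a + \omega b + \omega^2 d$   | $e_1 - e_2 - e_3 - e_4$
          | $d$                         | $e_1 + e_2$             | $a + \omega^2 b + \omega d$   | $e_1 - e_2 + e_3 + e_4$
          | $a + \omega b + \omega d$   | $e_1 + e_3 + e_4$       | $a + \omega^2 b + \omega^2 d$ | $e_1 - e_3 - e_4$
          | $a + b + \omega^2 d$        | $e_1 + e_2 - e_3 - e_4$ | $a + b + \omega d$            | $e_1 + e_2 + e_3 + e_4$
          | $a + \omega^2 b + d$        | $e_2 + e_3 + e_4$       | $a + \omega b + d$            | $e_2 - e_3 - e_4$
    $H_b$ | $a$                         | $e_1$                   | $a + c + d$                   | $e_3$
          | $c$                         | $e_1 - e_2$             | $a + \omega c + \omega^2 d$   | $e_2 - e_3$
          | $d$                         | $e_1 + e_2$             | $a + \omega^2 c + \omega d$   | $e_2 + e_3$
          | $a + \omega c + \omega d$   | $e_1 + e_3$             | $a + \omega^2 c + \omega^2 d$ | $e_1 - e_3$
          | $a + c + \omega^2 d$        | $e_1 + e_2 + e_3$       | $a + c + \omega d$            | $e_1 + e_2 - e_3$
          | $a + \omega^2 c + d$        | $e_1 - e_2 + e_3$       | $a + \omega c + d$            | $e_1 - e_2 - e_3$
    $H_a$ | $b$                         | $e_2$                   | $b + c + d$                   | $e_4$
          | $c$                         | $e_1 - e_2$             | $b + \omega c + \omega^2 d$   | $e_1 - e_4$
          | $d$                         | $e_1 + e_2$             | $b + \omega^2 c + \omega d$   | $e_1 + e_4$
          | $b + \omega c + \omega d$   | $e_2 + e_4$             | $b + \omega^2 c + \omega^2 d$ | $e_2 - e_4$
          | $b + c + \omega^2 d$        | $e_1 + e_2 + e_4$       | $b + c + \omega d$            | $e_1 + e_2 - e_4$
          | $b + \omega^2 c + d$        | $e_1 - e_2 - e_4$       | $b + \omega c + d$            | $e_1 - e_2 + e_4$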


It remains to verify that two orthogonal vectors $x$, $y$, belonging to two distinct
hyperplanes, for example $H_c$ and $H_d$, correspond to two orthogonal vectors of
$F_3^4$. Now, the hypothesis implies that the (isotropic) plane passing through $c$ and
$x$ and the (isotropic) plane passing through $d$ and $y$ contain the same isotropic
line, situated in the plane defined by $a$ and $b$. It therefore suffices to verify that,
for every isotropic vector $e$ in the plane defined by $a$ and $b$, a vector of the plane
passing through $c$ and $e$ and a vector of the plane passing through $d$ and $e$
correspond to orthogonal vectors of $F_3^4$ (since one already knows that this is so for
$c$ and $d$, and that all the vectors of an isotropic plane contained in one of the
hyperplanes $H_a$, $H_b$, $H_c$, $H_d$ correspond to the vectors of a non-isotropic plane
in $F_3^4$). This verification (which must be made for the isotropic vectors of the 6
planes passing through two of the vectors $a$, $b$, $c$, $d$) is very easy, and completes
the proof of the isomorphism between $PU_4^+(F_4)$ and $PSp_4(F_3)$.
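The verification described above can also be carried out mechanically on the whole table of 40 vectors. The sketch below assumes that $a$, $b$, $c$, $d$ have norm 1, so that the Hermitian form is $\sum x_i \bar{y}_i$ in these coordinates, and takes the symplectic form with $(e_1, e_3) = (e_2, e_4) = 1$. Elements of $F_4$ are coded 0, 1, 2, 3 with 2 for $\omega$ and 3 for $\omega^2$; this encoding and all names are ours.

```python
from itertools import combinations

W, W2 = 2, 3                      # codes for w and w^2 in F_4; addition is XOR
CONJ = {0: 0, 1: 1, 2: 3, 3: 2}   # x -> x^2

def mul4(x, y):
    return 0 if 0 in (x, y) else (x + y - 2) % 3 + 1

def herm(x, y):
    """Hermitian form on F_4^4 in the basis (a, b, c, d)."""
    total = 0
    for xi, yi in zip(x, y):
        total ^= mul4(xi, CONJ[yi])
    return total

def sympl(u, v):
    """Alternating form on F_3^4 with (e1, e3) = (e2, e4) = 1."""
    return (u[0] * v[2] - u[2] * v[0] + u[1] * v[3] - u[3] * v[1]) % 3

# (vector of F_4^4 in the basis a, b, c, d) -> (vector of F_3^4 in the basis e1..e4)
TABLE = [
    ((1, 0, 0, 0), (1, 0, 0, 0)), ((0, 1, 0, 0), (0, 1, 0, 0)),
    ((0, 0, 1, 0), (1, -1, 0, 0)), ((0, 0, 0, 1), (1, 1, 0, 0)),
    # hyperplane H_d
    ((1, 1, 1, 0), (0, 0, 1, -1)), ((1, W, W2, 0), (1, 1, -1, 1)),
    ((1, W2, W, 0), (1, 1, 1, -1)), ((1, W, W, 0), (1, 0, 1, -1)),
    ((1, W2, W2, 0), (1, 0, -1, 1)), ((1, 1, W2, 0), (1, -1, -1, 1)),
    ((1, 1, W, 0), (1, -1, 1, -1)), ((1, W2, 1, 0), (0, 1, -1, 1)),
    ((1, W, 1, 0), (0, 1, 1, -1)),
    # hyperplane H_c
    ((1, 1, 0, 1), (0, 0, 1, 1)), ((1, W, 0, W2), (1, -1, -1, -1)),
    ((1, W2, 0, W), (1, -1, 1, 1)), ((1, W, 0, W), (1, 0, 1, 1)),
    ((1, W2, 0, W2), (1, 0, -1, -1)), ((1, 1, 0, W2), (1, 1, -1, -1)),
    ((1, 1, 0, W), (1, 1, 1, 1)), ((1, W2, 0, 1), (0, 1, 1, 1)),
    ((1, W, 0, 1), (0, 1, -1, -1)),
    # hyperplane H_b
    ((1, 0, 1, 1), (0, 0, 1, 0)), ((1, 0, W, W2), (0, 1, -1, 0)),
    ((1, 0, W2, W), (0, 1, 1, 0)), ((1, 0, W, W), (1, 0, 1, 0)),
    ((1, 0, W2, W2), (1, 0, -1, 0)), ((1, 0, 1, W2), (1, 1, 1, 0)),
    ((1, 0, 1, W), (1, 1, -1, 0)), ((1, 0, W2, 1), (1, -1, 1, 0)),
    ((1, 0, W, 1), (1, -1, -1, 0)),
    # hyperplane H_a
    ((0, 1, 1, 1), (0, 0, 0, 1)), ((0, 1, W, W2), (1, 0, 0, -1)),
    ((0, 1, W2, W), (1, 0, 0, 1)), ((0, 1, W, W), (0, 1, 0, 1)),
    ((0, 1, W2, W2), (0, 1, 0, -1)), ((0, 1, 1, W2), (1, 1, 0, 1)),
    ((0, 1, 1, W), (1, 1, 0, -1)), ((0, 1, W2, 1), (1, -1, 0, -1)),
    ((0, 1, W, 1), (1, -1, 0, 1)),
]

# Orthogonality must be preserved in both directions for every pair of table entries.
preserved = all((herm(x, y) == 0) == (sympl(u, v) == 0)
                for (x, u), (y, v) in combinations(TABLE, 2))

def line(u):
    """Normalise a vector of F_3^4 so that its first non-zero coordinate is 1."""
    lead = next(c % 3 for c in u if c % 3)
    return tuple((c * lead) % 3 for c in u)   # 1 and 2 are their own inverses mod 3

image_lines = {line(u) for _, u in TABLE}
```

The check runs over all 780 pairs of entries, including pairs taken from two different hyperplanes, and `image_lines` shows that the 40 images lie on 40 distinct lines of $F_3^4$.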
Remarks. (1) The fact that there are 40 non-isotropic lines and 45 isotropic lines in
$F_4^4$ gives at once the known representations of $PU_4^+(F_4)$ as a group of
permutations of 40 or of 45 objects (2, p. 307). One easily sees that through every
isotropic line of $F_4^4$ there pass 3 totally isotropic planes: if the given line $D$
is in the plane defined by $a$ and $b$, for example, these three planes are those
passing through $D$ and the 3 isotropic lines of the plane defined by $c$ and $d$. Since
each totally isotropic plane contains 5 isotropic lines, there are in all
$3 \cdot 45/5 = 27$ totally isotropic planes, and this gives the representation of
$PU_4^+(F_4)$ as a group of permutations of 27 objects (2).

(2) In the proof of the isomorphism between $PU_4^+(F_4)$ and $PSp_4(F_3)$, we have used
the simplicity of the second group but not that of the first. Our proof can therefore
serve to prove that $PU_4^+(F_4)$ is simple; it is easy, starting from there, to prove
that $PU_n^+(F_4)$ is simple for $n > 4$: indeed, any two non-isotropic vectors $x$, $y$
in $F_4^n$ can always be embedded in a non-isotropic subspace $V$ of dimension 4, and
consequently if $(x, x) = (y, y)$, there always exists a unitary rotation in $V$ that
transforms $x$ into $y$. Since in $V$ every rotation is a product of hyperbolic
rotations, by virtue of the simplicity of $PU_4^+(F_4)$, one concludes without
difficulty, by induction on $n$, that every rotation in $U_n^+(F_4)$ is a product of
hyperbolic rotations (3, pp. 66-68). This done, the simplicity of $PU_4^+(F_4)$ makes it
possible to prove that every normal subgroup of $U_n^+(F_4)$ not contained in the centre
contains a hyperbolic rotation (3, pp. 70-71) and consequently is equal to
$U_n^+(F_4)$.
THE HOOK GRAPHS OF THE SYMMETRIC GROUP
J. S. FRAME, G. de B. ROBINSON, and R. M. THRALL


1. The hook graph. Each irreducible representation $[\lambda]$ of the symmetric
group $S_n$ may be identified by a partition $[\lambda]$ of $n$ into non-negative integral
parts $\lambda_1 \ge \lambda_2 \ge \ldots \ge \lambda_n \ge 0$, of which the first
$\lambda'_j$ parts are $\ge j$, or by a right (Young) diagram also called $[\lambda]$,
that contains $\lambda_i$ nodes in its $i$th row and $\lambda'_j$ nodes in its $j$th
column. An interchange of rows and columns in the diagram $[\lambda]$ converts it to the
associated diagram $[\lambda']$ belonging to the associated representation $[\lambda']$
of the same degree $f_\lambda$.

The node in the $i$th row and $j$th column of $[\lambda]$ is called its $ij$-node. It is
called the corner of the $ij$ right hook (7) that consists of this node and all nodes to
the right of it or below it. The $\lambda_i - j$ nodes on the right (in the $i$th row)
are called the arm of the $ij$ right hook, and the right end node is called the head. The
$\lambda'_j - i$ nodes below the $ij$-node (in the $j$th column) are called the leg of
the right hook, and the bottom node is called the foot. The total hook length $h_{ij}$ is


    1.1    $h_{ij} = 1 + (\lambda_i - j) + (\lambda'_j - i).$


The hook graph $H[\lambda]$ belonging to $[\lambda]$ is an array of positive integers
obtained by placing each of the $n$ hook lengths $h_{ij}$ at the corresponding $ij$-node
of the diagram $[\lambda]$. The hook product $H_\lambda$ is the product of the $n$
integers $h_{ij}$ in the hook graph.

The $ij$-node is called a $q$-node if and only if $h_{ij}$ is divisible by $q$. It is
called a $q$-node of residue $r$, or simply a $(q, r)$-node, if the integers $\lambda_i$
and $\lambda'_j$ satisfy the congruences

    1.2    $\lambda_i - i + 1 \equiv j - \lambda'_j \equiv r \pmod{q}, \qquad 1 \le r \le q.$

Clearly a $(q, r)$-node is also a $q$-node.
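Formula 1.1 translates directly into a short computation. The sketch below builds the hook graph of a partition given as a non-increasing list of positive parts and forms the hook product; the quotient $n!/H_\lambda$ computed at the end is the quantity that §2 identifies with the degree $f_\lambda$. The function names, and the partition $[3, 2]$ chosen for illustration, are ours.

```python
from math import factorial, prod

def conjugate(partition):
    """Column lengths lambda'_j of the diagram of a partition."""
    width = partition[0] if partition else 0
    return [sum(1 for part in partition if part > j) for j in range(width)]

def hook_graph(partition):
    """Hook lengths h_ij = 1 + (lambda_i - j) + (lambda'_j - i), with i, j counted from 1."""
    cols = conjugate(partition)
    return [[1 + (partition[i] - (j + 1)) + (cols[j] - (i + 1))
             for j in range(partition[i])]
            for i in range(len(partition))]

def hook_product(partition):
    return prod(h for row in hook_graph(partition) for h in row)

example = [3, 2]
graph = hook_graph(example)
product_of_hooks = hook_product(example)
quotient = factorial(sum(example)) // product_of_hooks
```

For the partition $[3, 2]$ of 5 the hook graph has rows 4, 3, 1 and 2, 1, so the hook product is 24 and the quotient $5!/24$ is 5.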

In this paper we present several properties of the representation $[\lambda]$ that can
be stated and proved more simply than heretofore by using the hook graph and
related concepts.

Following a preliminary lemma about the hook numbers associated with the
nodes of any right hook, we prove in §2 that the degree $f_\lambda$ of the representation
$[\lambda]$ is equal to the group order $n!$ divided by the hook product $H_\lambda$ if
the diagram $[\lambda]$ is either a right diagram or a direct sum of disjoint right
diagrams. We also characterize irreducible representations $[\lambda]$ of defect 0
(mod $p$) by the absence of multiples of $p$ in the hook graph.

In §3 we show that the simply constructed $q$-quotient diagram $[\lambda]_q$ obtained
by deleting all nodes of $[\lambda]$ except $q$-nodes is the same except for
rearrangement of disjoint constituents as the star diagram of Robinson, Staal and others
(1; 8; 9; 10; 11; 12), and we give simple proofs of Staal's Theorem B concerning
the removal of $kq$-hooks from $[\lambda]$. The relationship of Littlewood's $p$-quotient
(5) to Robinson's star diagram has been discussed by Farahat (2).

For each integer $r$ from 1 to $q$ the $(q, r)$-nodes of $[\lambda]$ (if such exist) are
shown in §4 to form one of the disjoint constituents of the (rearranged) star diagram.
From this follows a short proof of Staal's Theorem C.

In §5 we give a short hook graph proof of Staal's Theorem A concerning the
exponent of $p$ in the degree of $[\lambda]$ and in §6 we describe a constructive method
for determining the $q$-core from the hook graph without actually removing
hooks. Finally, in §7 we show how the leg length of a removable hook is determined
from the hook graph.


2. The degree of the representation $[\lambda]$. The $st$-node is called a rim node of
$[\lambda]$ if there is no node of $[\lambda]$ in the $(s + 1, t + 1)$ position. Counting along the
rim from the head of the right hook with corner at the $ij$-node (located at the
right of the $i$th row) to its foot at the bottom of the $j$th column, there are $h_{ij}$
rim nodes forming what we call the $ij$ skew or rim hook. Consider the two
pieces of the $ij$ rim hook obtained by cutting it between the $m$th and $(m + 1)$th
node, counting from the head. If these nodes are in the same row, and a vertical
cut is made between the $t$th and $(t - 1)$th columns, the upper right part ends
in a foot and is a rim hook of length $h_{it}$, but the lower left part does not start
with a head node and is not a rim hook. If, however, the two nodes are in the
same column, and a horizontal cut is made between the $(s - 1)$th and $s$th
rows, the lower left part starts with a head node and forms a rim hook of length
$h_{sj}$, whereas the upper right part with $h_{ij} - h_{sj}$ nodes does not end in a foot
and does not form a rim hook. As $m$ varies from 1 to $h_{ij}$, these lengths $m$ of the
upper right parts assume as values either $h_{it}$ $(t \ge j)$ or $h_{ij} - h_{sj}$ $(s > i)$, but
not both. Thus we establish the lemma


LEMMA 1. If $h_{it}$ $(j \le t \le \lambda_i)$ and $h_{sj}$ $(i < s \le \lambda'_j)$ are the $h_{ij}$ integers in the
$ij$-right hook of the hook graph $H[\lambda]$, then the integers $h_{it}$ and $h_{ij} - h_{sj}$ are distinct
and form a permutation of the integers $1, 2, \ldots, h_{ij}$.
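
As a numerical illustration of Lemma 1 (not part of the original argument), the following Python sketch builds the hook graph from 1.1 and checks the permutation property at every node. The function names `conjugate`, `hook_graph`, `lemma1_values` and `lemma1_holds` are ours and purely illustrative.

```python
def conjugate(lam):
    """Column lengths lambda'_j of the partition lam (a weakly decreasing list)."""
    return [sum(1 for part in lam if part > j) for j in range(lam[0])] if lam else []


def hook_graph(lam):
    """Hook lengths h_ij = 1 + (lambda_i - j) + (lambda'_j - i), with i, j counted from 1."""
    conj = conjugate(lam)
    return [[1 + (lam[i] - (j + 1)) + (conj[j] - (i + 1)) for j in range(lam[i])]
            for i in range(len(lam))]


def lemma1_values(lam, i, j):
    """Sorted values h_it (t >= j) and h_ij - h_sj (s > i) for the ij right hook (1-based)."""
    graph, conj = hook_graph(lam), conjugate(lam)
    h_ij = graph[i - 1][j - 1]
    along_row = graph[i - 1][j - 1:]
    down_column = [h_ij - graph[s - 1][j - 1] for s in range(i + 1, conj[j - 1] + 1)]
    return sorted(along_row + down_column)


def lemma1_holds(lam):
    """True if, at every node, the values above are exactly 1, 2, ..., h_ij."""
    return all(lemma1_values(lam, i + 1, j + 1) == list(range(1, h + 1))
               for i, row in enumerate(hook_graph(lam)) for j, h in enumerate(row))
```

For the corner node of $[6,4,2]$, for instance, the row contributes 8, 7, 5, 4, 2, 1 and the column contributes $8 - 5 = 3$ and $8 - 2 = 6$.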


It is clear from Lemma 1 that the product of all hook lengths in the $i$th row
of $H[\lambda]$ is given by the formula


2.1    $P_i = (h_{i1})! \,/ \prod_{s>i} (h_{i1} - h_{s1})$.


Now the first column hook lengths $h_{i1} = \lambda_i - i + \lambda'_1$ are precisely the
numbers $l_i$ that appear in the Frobenius formula (3; 4; 6) for the degree $f_\lambda$ of
$[\lambda]$, namely:


2.2    $f_\lambda = n! \, \prod_{s>i} (l_i - l_s) \,/ \prod_i (l_i)!$.


This formula was discovered independently by A. Young (14).
Setting $l_i = h_{i1}$ and substituting from 2.1 in 2.2 we obtain

2.3    $f_\lambda = n! \, \prod_i (1/P_i)$.


The product of all the $P_i$ is the complete hook product $H_\lambda$. Thus we have
proved our first main theorem.


THEOREM 1. Let $H_\lambda$ be the product of the hook numbers $h_{ij}$ in the hook graph of
an irreducible representation $[\lambda]$ of the symmetric group $S_n$. Then the degree $f_\lambda$ of
$[\lambda]$ is given by the formula

2.4    $f_\lambda = n!/H_\lambda$.


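Example 1. The irreducible representation $[6,4,2]$ of $S_{12}$ has the following
right diagram $[\lambda]$ and hook graph $H[\lambda]$.


2.5    $[\lambda]$:  • • • • • •        $H[\lambda]$:  8 7 5 4 2 1
              • • • •                      5 4 2 1
              • •                          2 1

Its degree is computed as follows:

2.6    $f_{[6,4,2]} = \dfrac{(12)!}{H_\lambda} = \dfrac{12\cdot 11\cdot 10\cdot 9\cdot 6\cdot 3}{5\cdot 4\cdot 2\cdot 2} = 11\cdot 3^5 = 2673$.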
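
The computation in Example 1 can be reproduced mechanically. The sketch below is an illustration only; the helper names are hypothetical. It evaluates formula 2.4 directly from the hook graph defined by 1.1.

```python
from math import factorial, prod


def conjugate(lam):
    """Column lengths lambda'_j of the partition lam."""
    return [sum(1 for part in lam if part > j) for j in range(lam[0])] if lam else []


def hook_graph(lam):
    """Hook lengths h_ij = 1 + (lambda_i - j) + (lambda'_j - i), with i, j counted from 1."""
    conj = conjugate(lam)
    return [[1 + (lam[i] - (j + 1)) + (conj[j] - (i + 1)) for j in range(lam[i])]
            for i in range(len(lam))]


def degree(lam):
    """Degree f_lambda = n! / H_lambda (formula 2.4)."""
    hook_product = prod(h for row in hook_graph(lam) for h in row)
    return factorial(sum(lam)) // hook_product
```

Calling `degree([6, 4, 2])` divides $12!$ by the hook product $179200$ of the graph 2.5.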


COROLLARY 1. Let $H_\lambda$ be the product of the hook numbers $h_{ij}$ in the hook graph
of the reducible representation $[\lambda]$ of $S_n$ that corresponds to a diagram consisting of
a number of disjoint right diagrams having $b_r$ nodes in the $r$th constituent. Then the
degree of $[\lambda]$ is given by the same formula 2.4.


Proof. The degree $f_\lambda$ is the number of standard orderings of the $n$ nodes of
the diagram such that the numbers increase from left to right within any row,
and from top to bottom within any column (9). There are $n!/\prod(b_r!)$ ways in
which the numbers 1 to $n$ can be assigned to the various constituents, and by
Theorem 1 there are $b_r!/H_{\lambda,r}$ ways of ordering them within the $r$th constituent,
if $H_{\lambda,r}$ is the hook product for the $r$th constituent. Hence


$f_\lambda = \dfrac{n!}{\prod_r b_r!} \prod_r \dfrac{b_r!}{H_{\lambda,r}} = \dfrac{n!}{\prod_r H_{\lambda,r}} = \dfrac{n!}{H_\lambda}$,


where the hook product $H_\lambda$ for $[\lambda]$ is the product of the hook products of its
constituents.

Another simple consequence of Theorem 1 is the following known result
about $p$-hooks for any prime $p$. An ordinary irreducible representation of a
group $G$ is said to be of defect 0 (mod $p$) when its degree is divisible by the
highest power of $p$ that divides the group order (1). Such a representation is an
indecomposable and irreducible modular component of the regular representation
and its character vanishes for $p$-singular classes. For the symmetric group
$S_n$ such representations of defect 0 are found by inspection of the hook graph as
follows.


COROLLARY 2. If $p$ is any prime, an ordinary irreducible representation $[\lambda]$ is
of defect 0 (mod $p$) if and only if its hook graph contains no multiple of $p$.


In Example 1 above we see that $[6,4,2]$ is of defect 0 (mod 3) and (mod 11).
However it is of maximum defect for $p$ = 2, 5, or 7.
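
The criterion of Corollary 2 is easy to test by machine. In the following sketch (illustrative names, our own choice) `is_defect_zero` inspects the hook graph, while `degree_valuation_is_maximal` checks the definition of defect 0 directly on the degree given by 2.4, so the two can be compared.

```python
from math import factorial, prod


def conjugate(lam):
    """Column lengths lambda'_j of the partition lam."""
    return [sum(1 for part in lam if part > j) for j in range(lam[0])] if lam else []


def hook_graph(lam):
    """Hook lengths h_ij = 1 + (lambda_i - j) + (lambda'_j - i), with i, j counted from 1."""
    conj = conjugate(lam)
    return [[1 + (lam[i] - (j + 1)) + (conj[j] - (i + 1)) for j in range(lam[i])]
            for i in range(len(lam))]


def valuation(m, p):
    """Exponent of the highest power of p dividing the positive integer m."""
    count = 0
    while m % p == 0:
        m //= p
        count += 1
    return count


def is_defect_zero(lam, p):
    """Corollary 2: no hook number is a multiple of p."""
    return all(h % p != 0 for row in hook_graph(lam) for h in row)


def degree_valuation_is_maximal(lam, p):
    """Definition of defect 0: the degree carries the full power of p dividing n!."""
    n_factorial = factorial(sum(lam))
    degree = n_factorial // prod(h for row in hook_graph(lam) for h in row)
    return valuation(degree, p) == valuation(n_factorial, p)
```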


3. The hook graph of the $q$-quotient (or star) diagram. Given a diagram $[\lambda]$
whose nodes include $b$ $q$-nodes, we define the $q$-quotient diagram $[\lambda]_q$ to be the
diagram of $b$ nodes obtained by deleting all the nodes of $[\lambda]$ except the $q$-nodes.


LEMMA 2. If the hook number of the $ij$-node in $[\lambda]$ is $h_{ij} = kq$, the hook number
of the corresponding node in $[\lambda]_q$ is $k = h_{ij}/q$.


Proof. In Lemma 1 it was proved that each number from 1 to $h_{ij}$ occurs
exactly once among the integers $h_{it}$ $(t \ge j)$ and $h_{ij} - h_{sj}$ $(s > i)$. Thus if
$h_{ij} = kq$ it follows that exactly $k$ multiples of $q$ appear among the numbers
$h_{it}$ $(t \ge j)$ and $kq - h_{sj}$ $(s > i)$. Hence exactly $k$ of the hook numbers $h_{it}$
$(t \ge j)$ and $h_{sj}$ $(s > i)$ are divisible by $q$, and there are $k$ different $q$-nodes in
the $ij$ right hook of $[\lambda]$. Only these $k$ nodes are retained in the corresponding
right hook of $[\lambda]_q$, so the corresponding hook number is $k$.


Thus we obtain the hook graph $H[\lambda]_q$ of the $q$-quotient if we divide each
hook number in $H[\lambda]$ by $q$ and retain only the integers. Lemma 2 shows that
our easily constructed $q$-quotient diagram is equivalent to the star diagram of
Robinson and Staal whose existence is proved in Staal's Theorem B, which he
stated as follows:


STAAL'S THEOREM B. Given the right diagram $\lambda$, and a positive integer $q$, there
exists a diagram $\lambda^*$ (called the "star diagram" of $\lambda$) such that there is a one-to-one
correspondence between the $kq$-hooks of $\lambda$ and the $k$-hooks of $\lambda^*$.


Staal's diagram $\lambda^*$ and our diagram $[\lambda]_q$ differ at most in the rearrangement
of disjoint constituents, but the order of rows and columns within each constituent
is the same for $\lambda^*$ and $[\lambda]_q$.

We next give a simpler proof of Staal's Theorem B, applied to $[\lambda]_q$ and
rephrased in our notation.


STAAL'S THEOREM B'. If a $k$-hook is removed from $[\lambda]_q$, leaving $[\mu]$, and if the
corresponding $kq$-hook is removed from $[\lambda]$ leaving $[\bar\lambda]$, then $[\bar\lambda]_q = [\mu]$.


New Proof. We shall study the effect of the respective hook removals on the
hook graphs of $[\lambda]$ and of $[\lambda]_q$. Let $[\bar\lambda]$ be obtained from $[\lambda]$ by removing either
the right $kq$-hook with corner at the $ij$-node or the corresponding $ij$ rim hook.
We may obtain $H[\bar\lambda]$ from $H[\lambda]$ in three steps.

1. We delete from $H[\lambda]$ the $kq$ hook numbers in the $ij$ right hook.

2. We diminish by $kq$ each of the integers $h_{sj}$ $(s < i)$ standing above $h_{ij}$
and move this reduced $j$th column of $H[\lambda]$ past the $\lambda_i - j$ columns of the hook
arm to form the $\lambda_i$th column of $H[\bar\lambda]$.

3. We diminish by $kq$ each of the integers $h_{it}$ $(t < j)$ standing to the left of
$h_{ij}$ and move this reduced $i$th row down past the $\lambda'_j - i$ rows of the hook leg
to form the $\lambda'_j$th row of $H[\bar\lambda]$.


The effect of these three operations on the $q$-nodes of $[\lambda]$, which are the nodes
of $[\lambda]_q$, is to remove the $k$ nodes belonging to the corresponding right $k$-hook
of $[\lambda]_q$, to reduce by $k$ the hook quotient numbers $h_{it}/q$ or $h_{sj}/q$ of $q$-nodes
above or to the left of $h_{ij}/q$, and to move them past the arm or leg of the $k$-hook.
Hence $[\bar\lambda]_q = [\mu]$.


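Example 2. To illustrate the effect of hook removal on the hook graph, we
remove from $H[7,6,5,3]$ a right 6-hook with corner at the 23 node. The six
rim nodes are shown by dots at the right. Then we form the 2-quotient hook
graph $H[\lambda]_2$ and remove the corresponding 3-hook. In each row below the middle
array shows the hook numbers after deletion (marked -) and reduction, before
the reduced row and column are moved.


      $H[\lambda]$                                       $H[\bar\lambda]$
  10 9 8 6 5 3 1        10 9 2 6 5 3 1        10 9 6 5 3 2 1
   8 7 6 4 3 1           2 1 - - - -           6 5 2 1 · ·
   6 5 4 2 1             6 5 - 2 1             3 2 · · ·
   3 2 1                 3 2 -                 2 1 ·

      $H[\lambda]_2$                                     $H[\bar\lambda]_2$
   5   4 3               5   1 3               5   3     1
   4   3 2               1   - -               3   1
   3   2 1               3   - 1                 1
     1                     1                   1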
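
The arrays of Example 2 can be regenerated from the definitions. The following Python sketch (helper names are ours, for illustration only) forms the hook graph by 1.1 and the quotient hook graph by the rule stated after Lemma 2, namely dividing each hook number by $q$ and retaining only the integers; the partition $[7,4,2,2]$ is the diagram left after removing the 6-hook.

```python
def conjugate(lam):
    """Column lengths lambda'_j of the partition lam."""
    return [sum(1 for part in lam if part > j) for j in range(lam[0])] if lam else []


def hook_graph(lam):
    """Hook lengths h_ij = 1 + (lambda_i - j) + (lambda'_j - i), with i, j counted from 1."""
    conj = conjugate(lam)
    return [[1 + (lam[i] - (j + 1)) + (conj[j] - (i + 1)) for j in range(lam[i])]
            for i in range(len(lam))]


def quotient_hook_graph(lam, q):
    """Row by row, the quotients h/q of the hook numbers divisible by q (spacing dropped)."""
    return [[h // q for h in row if h % q == 0] for row in hook_graph(lam)]
```

Note that this sketch discards the column positions of the $q$-nodes; it records only which quotients occur in each row.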


4. The disjoint constituents of the $q$-quotient (or star) diagram. It is clear
from 1.2 that two $q$-nodes in the same row or the same column of $[\lambda]$ have the
same residue $r$ (mod $q$), where $1 \le r \le q$. We shall call the $i$th row a $(q, r)$-row
and the $j$th column a $(q, r)$-column if and only if $i$ and $j$ satisfy 1.2. The
$(q, r)$-nodes of a right diagram $[\lambda]$ (if any exist) are those in $[\lambda]$ which lie at
intersections of $(q, r)$-rows and $(q, r)$-columns of $[\lambda]$, and they form a right
diagram $[\lambda]_{q,r}$ which is a disjoint constituent of $[\lambda]_q$. As $r$ varies from 1 to $q$ we
obtain at most $q$ such constituents, but some may be vacuous. Thus we obtain


THEOREM 3. The $q$-quotient diagram $[\lambda]_q$ derived from a right diagram $[\lambda]$ is
composed of at most $q$ disjoint right diagrams, of which the $r$th is composed of the
$(q, r)$-nodes of $[\lambda]$ if any exist.

It is easily seen that one or more rows (or columns) of any of the disjoint
constituents of the star diagram may be moved past any or all of the rows (or
columns) of a different constituent without affecting the hook numbers of this
diagram or most of its other essential properties. However, among equivalent
star diagrams of $[\lambda]$ the easiest from which to see the one-to-one correspondence
of its $k$-hooks with $kq$-hooks of $[\lambda]$ is the $q$-quotient diagram $[\lambda]_q$ defined above
in §3.

Staal's Theorem C follows immediately from our Theorem 3. His $h$'s are our
first row hook numbers $h_{1j}$, his $\lambda^*$ our $[\lambda]_q$.


STAAL'S THEOREM C. Gather the $h_{1j}$'s of $[\lambda]$ into classes which are congruent
(mod $q$). For each such class of congruent $h_{1j}$'s form the diagram having these as
first row hook numbers. The diagrams thus formed will be the constituents of the
star diagram $[\lambda]_q$.


New Proof. We see from 1.1 and 1.2 that

4.1    $h_{1j} \equiv \lambda_1 - r \pmod{q}$ if and only if $j - \lambda'_j \equiv r \pmod{q}$.


Hence the $q$-nodes in columns headed by hook numbers congruent to $\lambda_1 - r$
are $(q, r)$-nodes, and they form the $r$th constituent of $[\lambda]_q$. If we form a new
hook graph $H[\mu]$ by retaining only those top hook numbers in $H[\lambda]$ that are
congruent to $\lambda_1 - r$ (mod $q$), then by Lemma 1 the column of $H[\mu]$ headed by
$h_{1j}$ will have in addition to the hook numbers of the $j$th column of $H[\lambda]$ those
numbers $h_{1j} - h_{1t}$ such that $h_{1j} > h_{1t}$ but


$h_{1j} \not\equiv h_{1t} \pmod{q}$.


No new $q$-nodes are present, so the $q$-quotient $[\mu]_q$ is equivalent to $[\lambda]_{q,r}$.


THEOREM 4. If $p$ and $q$ are any integers and $[\lambda]$ any diagram, the $pq$-quotient of
$[\lambda]$ is the $q$-quotient of $[\lambda]_p$.


Proof. This analogue of Robinson's theorem (10) for the star diagram is a
trivial consequence of the fact stated in Lemma 2 that $pq$-nodes of $[\lambda]$ correspond
to $q$-nodes of $[\lambda]_p$.


Example 3. We illustrate Theorems 3 and 4 by showing the hook graphs
$H[\lambda]_q$ for $[\lambda] = [8,7,5,3,2]$ and $q$ = 1, 2, 3, 4, 6. Residues (mod 12) given at the
left should be reduced (mod $q$) to identify the various disjoint constituents.
Dots are for spacing only.


  r    q = 1               q = 2             q = 3             q = 4             q = 6
  8    12 11 9 7 6 4 3 1   6 . . . 3 2 . .   4 . 3 . 2 . 1 .   3 . . . . 1 . .   2 . . . 1 . . .
  6    10  9 7 5 4 2 1     5 . . . 2 1 .     . 3 . . . . .     . . . . 1 . .     . . . . . . .
  3     7  6 4 2 1         . 3 2 1 .         . 2 . . .         . . 1 . .         . 1 . . .
 12     4  3 1             2 . .             . 1 .             1 . .             . . .
 10     2  1               1 .               . .               . .               . .


5. The $p$-exponent in the degree. For any irreducible representation $[\lambda]$
and for any prime $p$, it is easy to determine from the hook graph the exponent
$e(f_\lambda)$ of the highest power of $p$ dividing the degree $f_\lambda$. In fact, equation 2.4
shows immediately that


5.1    $e(f_\lambda) = e(n!) - e(H_\lambda)$.


A similar formula holds moreover, by Corollary 1, for the degree $f_\lambda^*$ of the
reducible representation of $S_b$ associated with the $p$-quotient diagram $[\lambda]_p$,
namely


5.2    $e(f_\lambda^*) = e(b!) - e(H_{\lambda,p})$.


These facts make possible a simple proof of Robinson's version (1) of Nakayama's
formula (7), given as Theorem A in Staal's paper (12). Other proofs of
this result have recently been given by Nakayama and Osima (8) and Farahat (2).


THEOREM A. If $a$ denotes the number of nodes in the $p$-core of $[\lambda]$ then

5.3    $e(f_\lambda) = e(n!) - e((n - a)!) + e(f_\lambda^*)$.


Proof. Since $n - a = bp$, equations 5.1 and 5.2 enable us to rewrite 5.3 in
the form


5.4    $e(H_\lambda) - e(H_{\lambda,p}) = e((bp)!) - e(b!)$.


Each side of 5.4 reduces to $b$, since there are exactly $b$ explicit factors in the
indicated products $H_\lambda$ and $(bp)!$ that are divisible by $p$, and the quotients of
these factors by $p$ are the factors of $H_{\lambda,p}$ and $b!$ respectively.
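
Equation 5.4 lends itself to a direct numerical check. In the sketch below (illustrative only; `check_5_4` and the other names are ours) $b$ is taken to be the number of $p$-nodes of $[\lambda]$, and $H_{\lambda,p}$ is taken to be the product of the quotients $h_{ij}/p$ over those nodes, as in Lemma 2.

```python
from math import factorial


def conjugate(lam):
    """Column lengths lambda'_j of the partition lam."""
    return [sum(1 for part in lam if part > j) for j in range(lam[0])] if lam else []


def hook_graph(lam):
    """Hook lengths h_ij = 1 + (lambda_i - j) + (lambda'_j - i), with i, j counted from 1."""
    conj = conjugate(lam)
    return [[1 + (lam[i] - (j + 1)) + (conj[j] - (i + 1)) for j in range(lam[i])]
            for i in range(len(lam))]


def valuation(m, p):
    """Exponent e(m) of the highest power of p dividing the positive integer m."""
    count = 0
    while m % p == 0:
        m //= p
        count += 1
    return count


def check_5_4(lam, p):
    """Return (b, left side of 5.4, right side of 5.4)."""
    hooks = [h for row in hook_graph(lam) for h in row]
    p_hooks = [h for h in hooks if h % p == 0]
    b = len(p_hooks)
    left = sum(valuation(h, p) for h in hooks) - sum(valuation(h // p, p) for h in p_hooks)
    right = valuation(factorial(b * p), p) - valuation(factorial(b), p)
    return b, left, right
```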


6. Construction of the $q$-core. The $q$-core of $[\lambda]$ is the diagram $[a]$ that
remains after all $q$-hooks have been removed. Its partition numbers $a_i$ can be
constructed from the hook numbers $h_{i1}$ as follows:


THEOREM 6. Let $\beta_i$ be the number of $q$-nodes in the $i$th row of $[\lambda]$, and let $a_i$
be the number of nodes in the $i$th row of the $q$-core of $[\lambda]$. Then the numbers $h_{i1} - q\beta_i$
are distinct non-negative integers that form a permutation of the integers
$a_i - i + \lambda'_1$, if we set $a_i = 0$ for $i > a'_1$. The sign customarily attached to the
$q$-core is the sign of this permutation.


Proof. The effect on the hook graph of the removal of a $kq$-hook was described
in the proof of Theorem B' (§3). The effect on the first column hook
numbers $h_{i1}$ of removing from $[\lambda]$ a $q$-hook whose corner is not in the first
column is simply to diminish one of the $h_{i1}$ by $q$ and rearrange the order of rows.
Let all such be removed from $[\lambda]$, beginning with the bottom rows and working
up, so the only remaining $q$-nodes are a set of $k$ contained in the first column.
Their hook numbers are $q, 2q, \ldots, kq$, counting from the bottom. Reducing
each of these by $q$ is equivalent to deleting the number $kq$. The number of rows
lost in removing the final $kq$-hook is the smaller of the two numbers $kq$ and
$h_{11} - h_{12}$, and this is equal to $\lambda'_1 - a'_1$. If $kq$ is the smaller, the first column
hook numbers which are greater than $kq$ are each reduced by $\lambda'_1 - a'_1 = kq$
and become the hook numbers $a_i - i + a'_1$ of the $q$-core. If $h_{11} - h_{12} = d < kq$,
the second column hook numbers $h_{i1} - d$ (for $h_{i1} \ne kq$ and $h_{i1} > d$) become
the first column hook numbers $a_i - i + a'_1$ of the $q$-core $[a]$. In each case the
numbers $h_{i1} - q\beta_i$ are distinct and form a permutation of the integers
$a_i - i + \lambda'_1$, and the row numbers $a_i$ can be calculated. The sign attached to
the combined permutation of rows is the factor by which the Young symmetrizers
are altered in the reduction process, and this sign is customarily
attached to the $q$-core.


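Example 4. Find the 3-core of $[\lambda] = [9,7,4,3,2]$.


  $H[\lambda]$               | $h_{i1} - q\beta_i$ | $a_i - i + \lambda'_1$ | $a_i$         | $a_i - i + a'_1$ | $H[a]$
  13 12 10 8 6 5 4 2 1 |  7  |  7  | 7 - 4 = 3 |  4  | 4 2 1
  10  9  7 5 3 2 1     |  4  |  4  | 4 - 3 = 1 |  1  | 1
   6  5  3 1           |  0  |  2  | 2 - 2 = 0 |     |
   4  3  1             |  1  |  1  | 1 - 1 = 0 |     |
   2  1                |  2  |  0  | 0 - 0 = 0 |     |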
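
The construction of Theorem 6, as used in Example 4, can be sketched in a few lines of Python. The function `q_core` below is our own illustrative helper: it forms the numbers $h_{i1} - q\beta_i$, arranges them in decreasing order (which is the order of the integers $a_i - i + \lambda'_1$), and solves for the $a_i$. The sign of the permutation is not computed here.

```python
def conjugate(lam):
    """Column lengths lambda'_j of the partition lam."""
    return [sum(1 for part in lam if part > j) for j in range(lam[0])] if lam else []


def hook_graph(lam):
    """Hook lengths h_ij = 1 + (lambda_i - j) + (lambda'_j - i), with i, j counted from 1."""
    conj = conjugate(lam)
    return [[1 + (lam[i] - (j + 1)) + (conj[j] - (i + 1)) for j in range(lam[i])]
            for i in range(len(lam))]


def reduced_first_column(lam, q):
    """The numbers h_i1 - q * beta_i, where beta_i counts the q-nodes in row i."""
    return [row[0] - q * sum(1 for h in row if h % q == 0) for row in hook_graph(lam)]


def q_core(lam, q):
    """Rows a_i of the q-core, read off from Theorem 6 (zero rows dropped)."""
    rows = len(lam)
    ordered = sorted(reduced_first_column(lam, q), reverse=True)  # a_i - i + lambda'_1
    core = [value - (rows - i) for i, value in enumerate(ordered, start=1)]
    return [a for a in core if a > 0]
```

For $[9,7,4,3,2]$ and $q = 3$ the reduced numbers are 7, 4, 0, 1, 2, in agreement with the second column of the table above.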





7. Leg lengths. In the computation of characters (9) it is the leg length
rather than the class of a hook which is important. So far we have attached
significance only to $(q, r)$-nodes and their alignment in the rows and columns
of $H[\lambda]$. We state the following


THEOREM 7. The leg length of the $ij$-hook in $[\lambda]$ is the number of missing
integers less than $h_{ij}$ and to the right of it in $H[\lambda]$.


Proof. Since each such missing integer indicates one step down in the rim
of $[\lambda]$, the total number of such steps down is the leg length in question.
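
Theorem 7 can be compared with the direct definition of the leg length, $\lambda'_j - i$, by a short computation. The sketch below uses illustrative names of our own.

```python
def conjugate(lam):
    """Column lengths lambda'_j of the partition lam."""
    return [sum(1 for part in lam if part > j) for j in range(lam[0])] if lam else []


def hook_graph(lam):
    """Hook lengths h_ij = 1 + (lambda_i - j) + (lambda'_j - i), with i, j counted from 1."""
    conj = conjugate(lam)
    return [[1 + (lam[i] - (j + 1)) + (conj[j] - (i + 1)) for j in range(lam[i])]
            for i in range(len(lam))]


def leg_length_from_hooks(lam, i, j):
    """Number of integers below h_ij that are missing to the right of it in row i (1-based)."""
    row = hook_graph(lam)[i - 1]
    to_the_right = set(row[j:])
    return sum(1 for m in range(1, row[j - 1]) if m not in to_the_right)


def theorem7_holds(lam):
    """Compare the count above with the leg length lambda'_j - i at every node."""
    conj = conjugate(lam)
    return all(leg_length_from_hooks(lam, i, j) == conj[j - 1] - i
               for i in range(1, len(lam) + 1) for j in range(1, lam[i - 1] + 1))
```

In the second row of $H[7,6,5,3]$, for example, the integers 5 and 2 are missing to the right of $h_{23} = 6$, and the leg length of the 23-hook is 2.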


Nakayama studied the effect of interchanging the order of removing two
successive hooks in some detail (7, I, §§3, 4). It will be sufficient if we consider
only the "interlocking" case of an $ij$-hook and an $st$-hook, such that $\lambda_i > t$,
$i > s$, $j < t$. Applying the three steps of §3 it is evident that removing the
$ij$-hook first shortens the leg length of the $st$-hook by 1, while removing the
$st$-hook first lengthens the leg length of the $ij$-hook by 1. This makes explicit
the consideration of this same problem in (9, p. 289). If one hook is completely
contained within the other the problem does not arise, and if the two hooks do
not interlock then the leg lengths are unaffected.
SUBSTITUTION GROUPS OF FORMAL POWER SERIES
S. A. JENNINGS


In this paper we are concerned with the group $\mathfrak{G} = \mathfrak{G}(R)$ of formal power
series of the form


$f(x) = x + a_2x^2 + a_3x^3 + \ldots$,


the coefficients being elements of a commutative ring $R$ and the group operation
being substitution. Little seems to be known of the properties of groups
of this type, except in special cases, although groups of formal power series in
several variables with complex coefficients have been investigated from a
different point of view by Bochner and Martin (1, chap. I) and Gotô (2).

We study first some of the relations between properties of $R$ and of $\mathfrak{G}$, and
show in particular that $\mathfrak{G}$ may be topologized in a natural way, so that infinite
products may be introduced in $\mathfrak{G}$. In §§3 and 4 we consider the case where the
coefficients lie in a field $F$ of characteristic 0: a reasonably complete discussion
of the structure of $\mathfrak{G}(F)$ is obtained and it is shown that a Lie algebra can be
associated with $\mathfrak{G}(F)$ in a natural way. In particular we show that $\mathfrak{G}(F)$ is
generated by two one parameter subgroups. Finally, a brief discussion is given
of some of the properties of $\mathfrak{G}(J)$, where $J$ is the ring of ordinary integers.


1. Groups with a general coefficient ring. Let $R$ be any commutative,
associative ring, and let $x$ be an indeterminate which commutes with every
element of $R$. We consider the set of formal power series $f(x)$ of the form


(1.1.1)    $f(x) = x + a_2x^2 + \ldots + a_rx^r + \ldots$,

where $a_2, \ldots, a_r, \ldots$ are elements of $R$. If $g(x)$ is another such power series

$g(x) = x + b_2x^2 + \ldots + b_rx^r + \ldots$,


then, substituting formally, it follows readily that


$g(f(x)) = x + \sum c_rx^r$,    $r = 2, 3, \ldots$,

where

$c_2 = a_2 + b_2$,

(1.1.2)    $c_r = a_r + b_r + \sum_s b_s\phi_s(a_2, \ldots, a_r)$,    $s = 2, 3, \ldots, r - 1$;  $r = 3, 4, \ldots$,

and $\phi_s$ is a polynomial in $a_2, \ldots, a_r$ with integral coefficients of degree at most $s$,
without constant term. We note that $\phi_s$ is independent of the nature of the
ring $R$.
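
To make the substitution law (1.1.2) concrete, here is a small Python sketch working with truncated series over the integers. The representation (a list of coefficients `[0, 1, a_2, a_3, ...]`) and the names `mul` and `compose` are our own illustrative choices.

```python
def mul(p, q, n):
    """Product of two coefficient lists, truncated after the term in x^n."""
    out = [0] * (n + 1)
    for i, a in enumerate(p[:n + 1]):
        if a:
            for j, b in enumerate(q[:n + 1 - i]):
                out[i + j] += a * b
    return out


def compose(outer, inner, n):
    """Coefficients of outer(inner(x)) modulo x^(n+1), evaluated by Horner's rule.

    Both series are lists [0, 1, c_2, c_3, ...]; inner must have zero constant term.
    """
    result = [0] * (n + 1)
    for coeff in reversed(outer[:n + 1]):
        result = mul(result, inner, n)
        result[0] += coeff
    return result
```

With $f = x + 2x^2 + 3x^3$ and $g = x + 5x^2 + 7x^3$, `compose(g, f, 3)` gives $c_2 = a_2 + b_2 = 7$ and $c_3 = a_3 + b_3 + 2a_2b_2 = 30$, the last term being $b_2\phi_2$ with $\phi_2 = 2a_2$.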

Every $f(x)$ of the form (1.1.1) defines a mapping $F$ of the set of all such
power series on itself, via the substitution $x \to f(x)$. If $r(x)$ is any such power
series, we define $F$ to be the mapping:


$F : r(x) \to r(f(x))$.


We indicate the mapping which is determined by $f(g(x))$ by $FG : x \to f(g(x))$.
The associativity of this multiplication is automatic. The existence of an
inverse to $F : x \to f(x)$ follows from (1.1.2): for consider the function


$\bar f(x) = x + \sum \bar a_rx^r$,    $r = 2, 3, \ldots$,

where the coefficients $\bar a_r$ are defined inductively by


$\bar a_2 = -a_2$,

(1.1.3)    $\bar a_r = -a_r - \sum_{s=2}^{r-1} \bar a_s\phi_s(a_2, \ldots, a_r)$,    $r = 3, 4, \ldots$.


Clearly

$\bar f(f(x)) = x$,

so that the mapping $F^{-1} : x \to \bar f(x)$ is the inverse of $F : x \to f(x)$, and $f(\bar f) = x$
also. We have established, therefore,


THEOREM 1.1. The mappings $F : x \to f(x)$ of the set of all formal power series
$x + \sum a_rx^r$, $r = 2, 3, \ldots$,
with coefficients in an arbitrary commutative ring $R$, into itself form a group $\mathfrak{G}$.
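
The inductive definition (1.1.3) of the inverse can be carried out numerically. The following sketch (our own helper names; truncated series over the integers) determines each $\bar a_r$ as the value that cancels the coefficient of $x^r$ in $\bar f(f(x))$, which is the content of (1.1.3).

```python
def mul(p, q, n):
    """Product of two coefficient lists, truncated after the term in x^n."""
    out = [0] * (n + 1)
    for i, a in enumerate(p[:n + 1]):
        if a:
            for j, b in enumerate(q[:n + 1 - i]):
                out[i + j] += a * b
    return out


def compose(outer, inner, n):
    """Coefficients of outer(inner(x)) modulo x^(n+1), by Horner's rule."""
    result = [0] * (n + 1)
    for coeff in reversed(outer[:n + 1]):
        result = mul(result, inner, n)
        result[0] += coeff
    return result


def inverse(f, n):
    """Coefficients [0, 1, abar_2, ..., abar_n] with inverse(f)(f(x)) = x modulo x^(n+1)."""
    inv = [0, 1]
    for r in range(2, n + 1):
        # Coefficient of x^r in the partial inverse composed with f; abar_r must cancel it.
        partial = compose(inv + [0], f, r)
        inv.append(-partial[r])
    return inv
```

For $f = x + 2x^2 + 3x^3 + 4x^4$ this gives $\bar a_2 = -2$ and $\bar a_3 = -a_3 - \bar a_2 \cdot 2a_2 = 5$, and the composite is the identity in either order, as Theorem 1.1 requires.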


When necessary to stress the dependence of $\mathfrak{G}$ upon the coefficient ring $R$ we
write $\mathfrak{G} = \mathfrak{G}(R)$. Occasionally in what follows we will write (1.1.1) in the form


$f(x) = x(1 + a_2x + a_3x^2 + \ldots)$


but this will be a matter of convenience involving no assumption that $R$ contains
a unit element. If $R$ has a unit element 1 we identify $1\cdot x$ and $x$, but if not,
the element 1 which appears above can be considered as a unit formally
adjoined to $R$ in the usual manner.

We consider next subgroups of $\mathfrak{G}(R)$ of the type $\mathfrak{G}(S)$, where $S$ is a subring
of $R$. The elements of $\mathfrak{G}(S)$ are of the form


$G : x \to x + \sum b_rx^r$,    $r = 2, 3, \ldots$


with $b_r \in S$. These elements form a subgroup $\mathfrak{G}(S)$ of $\mathfrak{G}(R)$ which is proper if
$S$ is a proper subring of $R$. If $S$ is an ideal of $R$, consider $F^{-1}GF$ where $G \in \mathfrak{G}(S)$
and $F \in \mathfrak{G}(R)$. We have

$GF : x \to g(f(x))$

where

$g(f(x)) = f(x) + b_2f^2 + \ldots + b_rf^r + \ldots$
$\qquad\quad\; = f(x) + g'(x)$

and $g'(x)$ is a power series of the form $b'_2x^2 + b'_3x^3 + \ldots$, all of whose coefficients
are in $S$, since $S$ is an ideal. Hence


$F^{-1}GF : x \to \bar f(g(f)) = \bar f(f + g')$

$\bar f(f + g') = f + g' + \bar a_2(f + g')^2 + \bar a_3(f + g')^3 + \ldots$
$\qquad\qquad = \bar f(f) + g' + \bar a_2(2fg' + g'^2) + \ldots$
$\qquad\qquad = x + b''_2x^2 + b''_3x^3 + \ldots$


and $b''_2, b''_3, \ldots$ are in $S$. Hence if $S$ is an ideal,

$F^{-1}GF : x \to g''(x)$,


where all the coefficients of $g''$ are in $S$, and hence $F^{-1}GF \in \mathfrak{G}(S)$ and $\mathfrak{G}(S)$ is
a normal subgroup of $\mathfrak{G}(R)$.

If $S$, $T$ are two ideals of $R$ with $S\cdot T = 0$, then $\mathfrak{G}(S)$ and $\mathfrak{G}(T)$ permute
elementwise. For if $b_r \in S$, $c_s \in T$, then $b_rc_s = 0$ and hence by (1.1.2), if


$g(x) = x + \sum b_rx^r$,    $h(x) = x + \sum c_rx^r$,

$h(g(x)) = g(h(x))$
$\qquad\quad\; = x + (b_2 + c_2)x^2 + \ldots + (b_r + c_r)x^r + \ldots$.

In particular, if $R = S \oplus T$, then it follows at once that


$\mathfrak{G}(R) = \mathfrak{G}(S) \times \mathfrak{G}(T)$.


The above may be summarized in


THEOREM 1.2. The set of elements $x + \sum b_rx^r$ with coefficients in a subring $S$
of $R$ defines a subgroup $\mathfrak{G}(S)$ of $\mathfrak{G}(R)$. If $S$ is an ideal, $\mathfrak{G}(S)$ is normal in $\mathfrak{G}(R)$.
If $S$ and $T$ annihilate each other, then $\mathfrak{G}(S)$ and $\mathfrak{G}(T)$ permute elementwise. In
particular, if $R = S \oplus T$, then


$\mathfrak{G}(R) = \mathfrak{G}(S) \times \mathfrak{G}(T)$.


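If $\bar R$ is a homomorphic image of $R$, we consider next the relationship between
$\mathfrak{G}(R)$ and $\mathfrak{G}(\bar R)$.


THEOREM 1.3. If $S$ is an ideal of $R$, and $\bar R = R/S$, then

$\mathfrak{G}(\bar R) \cong \mathfrak{G}(R)/\mathfrak{G}(S)$.

Proof. Consider $\mathfrak{G}(R)$: an element $F$ is of the form

$F : x \to x + \sum a_rx^r$,    $a_r \in R$;  $r = 2, 3, \ldots$.


Now if in the homomorphism of $R$ onto $\bar R$, $a_r \to \bar a_r$, then we have also a mapping
$\mathfrak{G}(R) \to \mathfrak{G}(\bar R)$ via


$\{F : x \to x + \sum a_rx^r\} \to \{\bar F : x \to x + \sum \bar a_rx^r\}$


which is a homomorphism, since, by (1.1.2), if $a_r \to \bar a_r$ and $b_r \to \bar b_r$,
then

$[a_r + b_r + \sum b_s\phi_s(a_2, \ldots, a_r)] \to [\bar a_r + \bar b_r + \sum \bar b_s\phi_s(\bar a_2, \ldots, \bar a_r)]$.


That is, if $F \to \bar F$ and $G \to \bar G$ then $FG \to \bar F\bar G$. The kernel of this homomorphism
consists of all elements of $\mathfrak{G}(R)$ with coefficients in $S$, that is, $\mathfrak{G}(S)$, and
hence

$\mathfrak{G}(R)/\mathfrak{G}(S) \cong \mathfrak{G}(\bar R)$


as required. We may therefore write

$\mathfrak{G}(R/S) \cong \mathfrak{G}(R)/\mathfrak{G}(S)$.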


The following considerations throw some light on the nature of the group
$\mathfrak{G}(R)$. Let us assume that $R$ has a unit element, and let $P$ be the ring of all
formal power series in $x$ with coefficients in $R$. A typical element of $P$ is then
of the form


$p(x) = r_0 + r_1x + r_2x^2 + \ldots$,    $r_0, r_1, r_2, \ldots \in R$.


The set $M_i$ of all power series in $P$ of the form


$p_i(x) = r_ix^i + r_{i+1}x^{i+1} + \ldots$


is an ideal of $P$ for all $i = 1, 2, \ldots$ and $P/M_1 \cong R$. We consider (3, p. 117)
the automorphisms of $P$ over $R$, that is, those automorphisms which leave the
elements of $R$ fixed, and in particular the subgroup $\mathfrak{A}$ which leaves the elements
of $M_1$ modulo $M_2$ unchanged. If $A$ is any element of $\mathfrak{A}$, then the mapping
$A$ of $M_1$ will be completely determined by a knowledge of what happens to $x$
under $A$. However, since $M_1$ modulo $M_2$ is fixed under $A$, we have


$A : x \to x + p_2(x) = x + a_2x^2 + a_3x^3 + \ldots$

and for any element $p_i(x) \in P$,

$A : p_i(x) \to p_i(x + p_2(x))$.


The mapping $A$, therefore, is precisely an element of $\mathfrak{G}(R)$, and conversely any
mapping

$F : x \to f(x)$


of $\mathfrak{G}(R)$ gives rise to an automorphism of $\mathfrak{A}$,

$F : p_i(x) \to p_i(f(x))$.

We have thus proved


THEOREM 1.4. The group $\mathfrak{G}(R)$ is the group of relative automorphisms of $P$
over $R$ which leave $M_1$ modulo $M_2$ fixed.


We note that $M_1/M_{n+1}$ is a nilpotent ring, and the homomorphism
$M_1 \to M_1/M_{n+1}$ may be realized by setting $x^{n+1} = 0$ in all calculations with
elements of $M_1$. The group of automorphisms $\mathfrak{A}'$ of $M_1/M_{n+1}$ over $R$ which
leaves the elements of $M_1/M_{n+1}$ (mod $M_2/M_{n+1}$) fixed consists of the mappings

$F : x \to x + a_2x^2 + \ldots + a_nx^n = f(x)$,
$G : x \to x + b_2x^2 + \ldots + b_nx^n = g(x)$,

etc., where $FG$ is obtained by forming $f(g(x))$ and setting $x^{n+1} = 0$ in the result,
viz.,

$FG : x \to f(g(x))$ (mod $x^{n+1}$).


2. Commutators and the subgroup topology. The commutator structure of
$\mathfrak{G}(R)$ is revealed by considerations involving a different type of subgroup. For
given $R$, let $\mathfrak{G}_r(R)$ be the set of all elements of the form

$F_r : x \to x + a_{r+1}x^{r+1} + a_{r+2}x^{r+2} + \ldots$.

From (1.1.2) and (1.1.3) it follows that the set $\mathfrak{G}_r$ is a subgroup. If $G : x \to x
+ \sum c_sx^s$ is any element of $\mathfrak{G}(R)$, it is easily verified that $G^{-1}F_rG$ is given by a
series of the form


$x \to x + a'_{r+1}x^{r+1} + a'_{r+2}x^{r+2} + \ldots$,    $a'_s \in R$,  $s > r$,


so that $\mathfrak{G}_r$ is normal in $\mathfrak{G} = \mathfrak{G}_1$. Indeed, we have the descending chain of
normal subgroups


(2.1.1)    $\mathfrak{G} = \mathfrak{G}_1 \supset \mathfrak{G}_2 \supset \mathfrak{G}_3 \supset \ldots$.


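Let $F_r \in \mathfrak{G}_r$, $G_s \in \mathfrak{G}_s$. Then we may write


$F_r : x \to x + a_{r+1}x^{r+1} + a_{r+2}x^{r+2} + \ldots = x + x^{r+1}f$,
$F_r^{-1} : x \to x + \bar a_{r+1}x^{r+1} + \bar a_{r+2}x^{r+2} + \ldots = x + x^{r+1}\bar f$,
$G_s : x \to x + b_{s+1}x^{s+1} + b_{s+2}x^{s+2} + \ldots = x + x^{s+1}g$,
$G_s^{-1} : x \to x + \bar b_{s+1}x^{s+1} + \bar b_{s+2}x^{s+2} + \ldots = x + x^{s+1}\bar g$,


where $f = a_{r+1} + a_{r+2}x + \ldots$, etc. We assume $a_{r+1}, b_{s+1} \ne 0$. Then an easy but
somewhat tedious calculation using (1.1.2) and (1.1.3) shows that


$(F_r, G_s) = F_r^{-1}G_s^{-1}F_rG_s$


is given by

(2.1.2)    $x \to x + (r - s)\,a_{r+1}b_{s+1}\,x^{r+s+1} + c_{r+s+2}\,x^{r+s+2} + \ldots$,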

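where the $c_i$ are polynomials in the $a_i$ and $b_i$. It follows that


(2.1.3)    $(\mathfrak{G}_r, \mathfrak{G}_s) \subseteq \mathfrak{G}_{r+s}$,

and indeed, if $r = s$, $(\mathfrak{G}_r, \mathfrak{G}_r) \subseteq \mathfrak{G}_{2r+1}$ since in this case, as is easily verified,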
$(F_r, G_r) = x + (a_{r+2}b_{r+1} - a_{r+1}b_{r+2})\,x^{2r+2} + \ldots$,    $r > 1$,

and


(2.1.4)    $(F_1, G_1) = x + (a_3b_2 + a_2b_2^2 - a_2b_3 - a_2^2b_2)\,x^4 + \ldots$,    $r = 1$.


In particular, since $(\mathfrak{G}_r, \mathfrak{G}) \subseteq \mathfrak{G}_{r+1}$, the chain (2.1.1) is a central series of $\mathfrak{G}$.
We note too that the only element common to all the subgroups $\mathfrak{G}_r$ is the
element $x \to x$, the unit element of $\mathfrak{G}$.
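
The leading terms in (2.1.2) and (2.1.4) can be confirmed by direct substitution. The sketch below is illustrative only (integer coefficients, truncated series, our own function names). It forms $F^{-1}G^{-1}FG : x \to \bar f(\bar g(f(g(x))))$ with the product convention $FG : x \to f(g(x))$ of §1.

```python
def mul(p, q, n):
    """Product of two coefficient lists, truncated after the term in x^n."""
    out = [0] * (n + 1)
    for i, a in enumerate(p[:n + 1]):
        if a:
            for j, b in enumerate(q[:n + 1 - i]):
                out[i + j] += a * b
    return out


def compose(outer, inner, n):
    """Coefficients of outer(inner(x)) modulo x^(n+1), by Horner's rule."""
    result = [0] * (n + 1)
    for coeff in reversed(outer[:n + 1]):
        result = mul(result, inner, n)
        result[0] += coeff
    return result


def inverse(f, n):
    """Series fbar with fbar(f(x)) = x modulo x^(n+1), built term by term as in (1.1.3)."""
    inv = [0, 1]
    for r in range(2, n + 1):
        inv.append(-compose(inv + [0], f, r)[r])
    return inv


def commutator(f, g, n):
    """Coefficients of (F, G) = F^-1 G^-1 F G, i.e. fbar(gbar(f(g(x)))), modulo x^(n+1)."""
    fg = compose(f, g, n)
    return compose(inverse(f, n), compose(inverse(g, n), fg, n), n)
```

With $F_2 = x + 2x^3$ and $G_3 = x + 5x^4$ the first non-vanishing term is $(r - s)a_{r+1}b_{s+1}x^{r+s+1} = -10x^6$; with $F_1 = x + x^2 + 2x^3$ and $G_1 = x + 3x^2 + 5x^3$ formula (2.1.4) gives the coefficient $6 + 9 - 5 - 3 = 7$ of $x^4$.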


THEOREM 2.1. The elements of $\mathfrak{G}(R)$ of the form

$F_r : x \to x + a_{r+1}x^{r+1} + \ldots$


form a normal subgroup $\mathfrak{G}_r(R)$. The descending chain (2.1.1) is a central series of
$\mathfrak{G}$ with the stronger property (2.1.2) and $\mathfrak{G}$ is generalized nilpotent in the sense
that the intersection of all terms of the central series (2.1.1) consists of the unit
element.


We may introduce a subgroup topology in $\mathfrak{G}$ by taking the normal subgroups
$\{\mathfrak{G}_r\}$ as a system of neighbourhoods of the identity in $\mathfrak{G}$. In particular, if $G_n$
is a sequence of elements of $\mathfrak{G}$, we will say that $\lim G_n = G$, where $G \in \mathfrak{G}$, if
for any integer $N$ we can find another integer $N_0$ such that $G_n \equiv G$ mod $\mathfrak{G}_N(R)$
for all $n > N_0$. The group $\mathfrak{G}$ topologized in this fashion is 0-dimensional, and
the subgroups $\mathfrak{G}_r$ are both open and closed. Indeed, we remark that any subgroup
of the form $\mathfrak{G}(S)$, where $S$ is a subring of $R$, is closed in this topology.
For if $G$ is a limit point of $\mathfrak{G}(S)$, there exists a sequence of elements $H_1, H_2, \ldots$
belonging to $\mathfrak{G}(S)$, and such that


$\lim H_r = G$.

That is, for given $N$, there exists $N_0$ such that

$G \equiv H_r$ mod $\mathfrak{G}_N$


for all $r > N_0$, or in other words, the coefficients of $x^2, x^3, \ldots, x^N$ of $G$ are the
same as those of $H_r$ for $r > N_0$, and since these last belong to $S$, the coefficients
of $G$ all belong to $S$, $N$ being arbitrary.

We consider now the factor groups $\mathfrak{G}/\mathfrak{G}_n$. If $F$ is any element of $\mathfrak{G}$, with


$F : x \to x + a_2x^2 + \ldots + a_nx^n + a_{n+1}x^{n+1} + \ldots$,


we show that there exists an element $F_n$ in $\mathfrak{G}_n$ such that if $\bar F$ is given by


(2.2.1)    $\bar F : x \to x + a_2x^2 + \ldots + a_nx^n = x + \bar f$,

then

(2.2.2)    $F = \bar F F_n$.

For, since $\mathfrak{G}$ is a group, we may solve the equation $F = \bar FX$ and get

$X : x \to x + \sum c_rx^r = x + g(x)$,    $r = 2, 3, \ldots$.


Substituting in (2.2.2) we get

$x + \sum a_rx^r = x + g(x) + a_2(x + g)^2 + \ldots + a_n(x + g)^n$

and since $x + \sum a_rx^r = x + \bar f + a_{n+1}x^{n+1} + \ldots$ we have


(2.2.3)    $x + \bar f + a_{n+1}x^{n+1} + \ldots = x + \bar f + g(x)$
$\qquad\qquad\qquad + a_2(2xg + g^2) + \ldots + a_n(nx^{n-1}g + \ldots + g^n)$.

Comparing both sides of (2.2.3) we see that

$a_{n+1}x^{n+1} + \ldots = g(x)\,[1 + 2a_2x + \ldots]$

and hence $c_2 = c_3 = \ldots = c_n = 0$, which establishes (2.2.2).
Using the coefficients in

$F^{-1} : x \to x + \bar a_2x^2 + \ldots$,

we set

$F^0 : x \to x + \bar a_2x^2 + \ldots + \bar a_nx^n$;

then by the above there exists an element $G_n$ of $\mathfrak{G}_n$ such that

$F^{-1} = F^0G_n$,

and hence $F\cdot F^0$ is in $\mathfrak{G}_n$, that is,

$F^0 \equiv \bar F^{-1} \equiv F^{-1}$ (mod $\mathfrak{G}_n$).

Similarly, if $G \equiv \bar G$ (mod $\mathfrak{G}_n$), with

$\bar G : x \to x + b_2x^2 + \ldots + b_nx^n$,

then

$FG \equiv A$ (mod $\mathfrak{G}_n$),


where $A$ is obtained by substituting


$x + b_2x^2 + \ldots + b_nx^n$


into


$x + a_2x^2 + \ldots + a_nx^n$


and setting $x^{n+1} = 0$ in the result. It follows that $\mathfrak{G}$ modulo $\mathfrak{G}_n$ is isomorphic
to the group $\mathfrak{A}'$ discussed in §1. We have therefore proved


THEOREM 2.2. The group $\mathfrak{G}/\mathfrak{G}_n$ is isomorphic to the group $\mathfrak{A}'$ of relative
automorphisms of the ring $M_1/M_{n+1}$ which leave the elements of $M_1/M_2$ invariant.

We observe that, both because of (2.1.2) and also by (2.2), the group $\mathfrak{G}/\mathfrak{G}_n$
is nilpotent, in the usual sense, and of class $n - 1$ at most. Indeed, $\mathfrak{G}/\mathfrak{G}_3$ is
abelian.

We investigate now the orders of elements of $\mathfrak{G}$.


THEOREM 2.3. If $\alpha$ is an integer, and if $F$ is given by

$F : x \to x + a_{r+1}x^{r+1} + \ldots$ with $a_{r+1} \ne 0$,


then $F^\alpha$ is given by

$F^\alpha : x \to x + \alpha a_{r+1}x^{r+1} + \ldots$.


In particular, if $a_{r+1}$ is such that $\alpha a_{r+1} = 0$ implies $\alpha = 0$, then $F$ is an element
of infinite order.


COROLLARY 2.4. If $R$ is of characteristic zero (that is, if $\alpha a = 0$, $a \ne 0$
implies $\alpha = 0$ for all $a \in R$) then every element of $\mathfrak{G}(R)$ other than the identity is
of infinite order.

THEOREM 2.5. If $R$ is of prime characteristic $p$ (i.e., if $pa = 0$ for all $a \in R$), then

$\mathfrak{G}_r^{\,p} \subseteq \mathfrak{G}_{rp}$,

that is, the $p$th power of every element in $\mathfrak{G}_r$ is in $\mathfrak{G}_{rp}$.


Theorem 2.3 follows at once from (1.1.2) while 2.4 follows from 2.3. We will
prove 2.5 by establishing first the following:


LEMMA 2.6. If $f_r = 1 + a_1x^r + \ldots$, and $f_s = 1 + b_1x^s + \ldots$, where $a_i$, $b_i$
are in $R$, then there exists an $f_{r+s}$,


$f_{r+s} = 1 + c_1x^{r+s} + \ldots$,

with coefficients in $R$ such that

$f_s(xf_r) = f_s \cdot f_{r+s}$,

where $f_s \cdot f_{r+s}$ is the formal product of the series on the right.
Proof of 2.6. Consider 
(2.6.1) 1+ ayx*(1 + dye? +... .)° + aox**(1 + aye’? +... T+... 
=1+aynx' +... + ant! + sajydyx"t* +.... 

On the other hand, 
(2.6.2) (l+ayn~'*+... + a,.%""! 4 ayy’ +...) (1 + ee"? 4+...) 

= 1+ ayxe* +... faett! + (@egr + 1)et* + Grae + ext! +..., 


so that, equating coefficients of $x^{r+s}, x^{r+s+1}, \ldots$ in (2.6.1) and (2.6.2) in succession,
we may obtain the $c_i$ as polynomials in the coefficients $a_i$ and $b_i$.


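Proof of 2.5. Let $F_r : x \to x + a_{r+1}x^{r+1} + \ldots$ be any element of $\mathfrak{G}_r(R)$.
We set


$x + a_{r+1}x^{r+1} + \ldots = xf_r(x)$,

where $f_r = 1 + a_{r+1}x^r + \ldots$ is as in Lemma 2.6. Then

$F_r^2 : x \to xf_r \cdot f_r(xf_r)$,

and by (2.6) there exists an $f_{2r} = 1 + b_{2r}x^{2r} + \ldots$ such that

$F_r^2 : x \to xf_r^2 \cdot f_{2r}$.

We prove by induction that, for $\alpha$ an integer,

(2.6.3)    $F_r^\alpha : x \to x\,f_r^{\binom{\alpha}{1}} f_{2r}^{\binom{\alpha}{2}} \cdots f_{\alpha r}^{\binom{\alpha}{\alpha}}$,

where $f_{\alpha r}$ is defined inductively by


$f_{(\alpha-1)r}(xf_r) = f_{(\alpha-1)r}(x) \cdot f_{\alpha r}$


and is therefore of the form


$f_{\alpha r} = 1 + d_{\alpha r}x^{\alpha r} + \ldots$,    $d_{\alpha r}, d_{\alpha r+1}, \ldots \in R$,


and $\binom{\alpha}{i}$ is the usual binomial coefficient. For if (2.6.3) holds for given $\alpha$, then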
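$F_r^{\alpha+1} = F_r^\alpha F_r : x \to xf_r \cdot f_r^{\binom{\alpha}{1}}(xf_r)\, f_{2r}^{\binom{\alpha}{2}}(xf_r) \cdots f_{\alpha r}^{\binom{\alpha}{\alpha}}(xf_r)$


$\qquad = xf_r\,(f_rf_{2r})^{\binom{\alpha}{1}} (f_{2r}f_{3r})^{\binom{\alpha}{2}} \cdots (f_{\alpha r}f_{(\alpha+1)r})^{\binom{\alpha}{\alpha}}$


and hence

$F_r^{\alpha+1} : x \to x\,f_r^{\binom{\alpha+1}{1}} f_{2r}^{\binom{\alpha+1}{2}} \cdots f_{(\alpha+1)r}^{\binom{\alpha+1}{\alpha+1}}$,

which establishes (2.6.3) for all $\alpha$.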


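Now if $\alpha = p$, a prime, we have, since $p$ divides all of


$\binom{p}{1}, \binom{p}{2}, \ldots, \binom{p}{p-1}$,


(2.6.4)    $F_r^p : x \to x\,f_r^{p\beta_1} f_{2r}^{p\beta_2} \cdots f_{(p-1)r}^{p\beta_{p-1}} f_{pr}$,

where the $\beta_i$ are integers and hence, since if $R$ is of characteristic $p$,


$(1 + d_{ir}x^{ir} + \ldots)^p = 1 + d_{ir}^{\,p}x^{irp} + \ldots$,

we see that $F_r^p$ has the form


$F_r^p : x \to x(1 + e_1x^{rp} + \ldots)$
$\qquad\; = x + e_1x^{rp+1} + \ldots$,


where $e_1, e_2, \ldots$ depend on $a_{r+1}, \ldots$. Hence $F_r^p \in \mathfrak{G}_{rp}$ if $F_r \in \mathfrak{G}_r$.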
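
Theorems 2.3 and 2.5 can be observed numerically on truncated series. The sketch below is an illustration with our own helper names; the optional argument `modulus` reduces coefficients modulo $p$ so as to model a coefficient ring of characteristic $p$ (here the integers modulo $p$).

```python
def mul(p, q, n):
    """Product of two coefficient lists, truncated after the term in x^n."""
    out = [0] * (n + 1)
    for i, a in enumerate(p[:n + 1]):
        if a:
            for j, b in enumerate(q[:n + 1 - i]):
                out[i + j] += a * b
    return out


def compose(outer, inner, n, modulus=None):
    """Coefficients of outer(inner(x)) modulo x^(n+1), optionally reduced mod modulus."""
    result = [0] * (n + 1)
    for coeff in reversed(outer[:n + 1]):
        result = mul(result, inner, n)
        result[0] += coeff
        if modulus is not None:
            result = [c % modulus for c in result]
    return result


def power(f, alpha, n, modulus=None):
    """Coefficients of the alpha-fold substitution F^alpha (alpha >= 0) modulo x^(n+1)."""
    result = [0, 1] + [0] * (n - 1)  # the identity x -> x
    for _ in range(alpha):
        result = compose(result, f, n, modulus)
    return result
```

Over the integers, the fifth power of $x + 3x^3$ begins $x + 15x^3$, as Theorem 2.3 states. Modulo 3, the cube of an element of $\mathfrak{G}_2$ has vanishing coefficients of $x^2, \ldots, x^6$, so that it lies in $\mathfrak{G}_6$, as Theorem 2.5 states.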


COROLLARY 2.7. If $pR = 0$ and $r < p^{\alpha}$ then $\mathfrak{G}/\mathfrak{G}_r$ is a group all of whose
elements are of order at most $p^{\alpha}$.


THEOREM 2.8. If $R$ is nilpotent, with $R^n = 0$, then the lower central series of
$\mathfrak{G}$ is of finite length, that is, $\mathfrak{G}(R)$ is nilpotent in the usual sense.


Proof of 2.8. As in (2.1.2), we may readily verify that the mappings

$F : x \to x + a_2x^2 + \ldots$


and

$G : x \to x + b_2x^2 + \ldots$

yield

(2.8.1)    $(F, G) = x + c_4x^4 + c_5x^5 + \ldots$,


where each coefficient in (2.8.1) is the product of two or more coefficients
$a_i$, $b_i$. Hence

$(F, G) \in \mathfrak{G}(R^2)$

and in general we verify that

(2.8.2)    $C_n \in \mathfrak{G}(R^n)$,


where $C_n$ is any commutator of weight $n$ in the elements of $\mathfrak{G}(R)$. If $R^n = 0$,
then $C_n = 1$ as required. Indeed, we remark that, more generally, if $S$ and $T$
are ideals of $R$, and $G_r(S) \in \mathfrak{G}_r(S)$, $G_s(T) \in \mathfrak{G}_s(T)$ then


(2.8.3)    $(G_r(S), G_s(T)) \in \mathfrak{G}_{r+s}(S\cdot T)$.
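For satisfactory definition of the lower central series of $\mathfrak{G}(R)$ we introduce
the notion of a topological generating set. Let $K$ be any set of elements of $\mathfrak{G}(R)$
and let $\mathfrak{K}$ be the smallest closed subgroup of $\mathfrak{G}$ containing $K$. Then we say
that $\mathfrak{K} = [K]$ and that $\mathfrak{K}$ is generated (topologically) by $K$. In particular, if
$\mathfrak{A}$, $\mathfrak{B}$ are normal closed subgroups of $\mathfrak{G}$, we define $[\mathfrak{A}, \mathfrak{B}]$ as the smallest closed
subgroup generated by all elements of the form $(A, B) = A^{-1}B^{-1}AB$ where
$A \in \mathfrak{A}$ and $B \in \mathfrak{B}$. In general $[\mathfrak{A}, \mathfrak{B}]$ is normal, since if


$C = \lim C_n$

then

$G^{-1}CG = \lim\,(G^{-1}C_nG)$.


In our case $[\mathfrak{A}, \mathfrak{B}]$ is a proper subgroup of both $\mathfrak{A}$ and $\mathfrak{B}$, since if $\mathfrak{A} \subseteq \mathfrak{G}_r$ but
$\mathfrak{A} \not\subseteq \mathfrak{G}_{r+1}$, $\mathfrak{B} \subseteq \mathfrak{G}_s$ but $\mathfrak{B} \not\subseteq \mathfrak{G}_{s+1}$, then $[\mathfrak{A}, \mathfrak{B}] \subseteq \mathfrak{G}_{r+s}$, so that $[\mathfrak{A}, \mathfrak{B}] \ne \mathfrak{A}, \mathfrak{B}$.


We may now define the lower central series of $\mathfrak{G}(R)$:


(2.9.1)    $\mathfrak{G} = \mathfrak{H}_1 \supset \mathfrak{H}_2 \supset \ldots \supset \mathfrak{H}_i \supset \ldots$


by setting $\mathfrak{G} = \mathfrak{H}_1$, and $\mathfrak{H}_{i+1} = [\mathfrak{H}_i, \mathfrak{G}]$ for $i \ge 1$. By the above, $\mathfrak{H}_i \subseteq \mathfrak{G}_i(R)$
and hence in the series (2.9.1) $\bigcap \mathfrak{H}_i = 1$, so that $\mathfrak{G}$ is generalized nilpotent in
the usual sense.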


3. Power series groups over a field of characteristic zero. If the ring $R$
satisfies suitable chain conditions then it has a nilpotent radical $N$ and $R/N$ is
a direct sum of fields. In this case $\mathfrak{G}(N)$ is a normal subgroup, which by (2.8)
is nilpotent in the usual sense, while by (1.2) $\mathfrak{G}(R/N)$ is a direct product of
groups $\mathfrak{G}(F_i)$ where the $F_i$ are fields. For groups of the type $\mathfrak{G}(F)$, where $F$
is a field, two cases arise, according as $F$ is of characteristic 0, or characteristic
$p \neq 0$. We consider in the present paper only the case when $F$ is of characteristic 0.
The case where $F$ is of characteristic $p$ will be discussed elsewhere.

In the rest of this section, therefore, the coefficient ring will be
a field $F$ of characteristic zero: we will write $\mathfrak{G} = \mathfrak{G}(F)$. Our first result deals
with certain one parameter subgroups of $\mathfrak{G}$.


THEOREM 3.1. Let $G_k(\alpha)$ be the element of $\mathfrak{G}(F)$ defined by the mapping

(3.1.1)    $G_k(\alpha): x \to \dfrac{x}{(1 - k\alpha x^k)^{1/k}},$

where $\alpha \in F$, $k = 1, 2, \ldots$ and

(3.1.2)    $\dfrac{x}{(1 - k\alpha x^k)^{1/k}} = x + \alpha x^{k+1} + \dfrac{(1+k)\,\alpha^2x^{2k+1}}{2!} + \cdots + \dfrac{(1+k)(1+2k)\cdots(1+(n-1)k)\,\alpha^nx^{nk+1}}{n!} + \cdots$

is the power series, with coefficients in $F$, obtained by expanding $x(1 - k\alpha x^k)^{-1/k}$
formally by the binomial theorem. Then for fixed $k$ the set of elements $\{G_k(\alpha)\}$,
as $\alpha$ runs over $F$, is an abelian subgroup of $\mathfrak{G}$ contained in $\mathfrak{G}_k$, with

(3.1.3)    $G_k(\alpha) \cdot G_k(\beta) = G_k(\alpha + \beta), \qquad G_k(0) = 1, \qquad G_k(-\alpha) = G_k^{-1}(\alpha).$


Proof. The fact that $G_k(\alpha)\,G_k(\beta) = G_k(\alpha + \beta)$ may be verified by direct
substitution in (3.1.1).
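
The group law (3.1.3) can also be checked numerically on truncated series. The following is only a minimal sketch and not part of the original argument: the helper names `mul`, `compose` and `one_param` are hypothetical, series are stored as lists of exact fractions truncated after $x^{\text{order}}$, and the product of two substitutions is taken to be composition (the order of composition is immaterial here because the subgroup is abelian).

```python
from fractions import Fraction

def mul(a, b, order):
    """Product of two truncated power series (coefficient lists of length order + 1)."""
    out = [Fraction(0)] * (order + 1)
    for i, ai in enumerate(a):
        if ai == 0:
            continue
        for j, bj in enumerate(b):
            if i + j > order:
                break
            out[i + j] += ai * bj
    return out

def compose(outer, inner, order):
    """Coefficients of outer(inner(x)) up to x^order; inner must have zero constant term."""
    result = [Fraction(0)] * (order + 1)
    power = [Fraction(1)] + [Fraction(0)] * order  # inner(x)^0
    for c in outer:
        for i in range(order + 1):
            result[i] += c * power[i]
        power = mul(power, inner, order)
    return result

def one_param(k, alpha, order):
    """Series (3.1.2): x * (1 - k*alpha*x^k)^(-1/k), truncated after x^order."""
    coeffs = [Fraction(0)] * (order + 1)
    term = Fraction(1)  # (1+k)(1+2k)...(1+(n-1)k) / n!, starting at n = 0
    n = 0
    while n * k + 1 <= order:
        coeffs[n * k + 1] = term * Fraction(alpha) ** n
        term = term * (1 + n * k) / (n + 1)  # step from n to n + 1
        n += 1
    return coeffs
```

For instance, `compose(one_param(2, a, 12), one_param(2, b, 12), 12)` agrees with `one_param(2, a + b, 12)` for any rational `a`, `b`, which is the first relation of (3.1.3) up to the chosen truncation.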


THEOREM 3.2. The one parameter subgroups $\{G_k(\alpha)\}$, $k = 1, 2, \ldots$, determine
a uniqueness basis for $\mathfrak{G}$ in the sense that every element $G$ of $\mathfrak{G}$ may be written
uniquely

    $G = G_1(\alpha_1)\,G_2(\alpha_2) \cdots G_n(\alpha_n) \cdots.$

The infinite product on the right is to be interpreted as

    $\lim_{n \to \infty} G_1(\alpha_1) \cdots G_n(\alpha_n)$

in the subgroup topology.
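Proof. Let $G$ be given by

    $G: x \to x + a_2x^2 + \cdots + a_nx^n + \cdots.$

It will suffice to prove that, for any $n$, there exist elements $\alpha_1, \alpha_2, \ldots, \alpha_n$ of $F$
such that

    $G \equiv G_1(\alpha_1) \cdots G_n(\alpha_n) \pmod{\mathfrak{G}_{n+1}}.$

For $n = 1$, it is clear that

    $G \equiv G_1(a_2) \pmod{\mathfrak{G}_2}$

and $\alpha_1 = a_2$. Assume that for given $k$ there exist coefficients $\bar\alpha_1, \ldots, \bar\alpha_k$ which
are polynomials in $a_2, \ldots, a_{k+1}$ so that

(3.2.1)    $G \equiv G_1(\bar\alpha_1) \cdots G_k(\bar\alpha_k) \pmod{\mathfrak{G}_{k+1}}.$

If we consider

    $G_1(\alpha_1) \cdots G_k(\alpha_k): x \to x + \beta_2x^2 + \cdots + \beta_{k+1}x^{k+1} + \beta_{k+2}x^{k+2} + \cdots$

for $\alpha_1, \ldots, \alpha_k$ arbitrary, the coefficients $\beta_i$, for all $i$, are polynomials in $\alpha_1, \ldots, \alpha_k$.
By our induction, when

    $\alpha_i = \bar\alpha_i$, we have $\beta_2 = a_2, \ldots, \beta_{k+1} = a_{k+1}$, and $\beta_{k+2} = \bar\beta_{k+2}$

say, where $\bar\beta_{k+2}$ is now a polynomial in $\bar\alpha_1, \ldots, \bar\alpha_k$ and hence in $a_2, \ldots, a_{k+1}$.
Consider now

    $G_1(\alpha_1) \cdots G_k(\alpha_k)\,G_{k+1}(\alpha_{k+1}): x \to x + \beta_2x^2 + \cdots + \beta_{k+1}x^{k+1} + (\alpha_{k+1} + \beta_{k+2})x^{k+2} + \cdots.$

Define $\bar\alpha_{k+1} = a_{k+2} - \bar\beta_{k+2}$. Then $\bar\alpha_{k+1}$ is a polynomial in $a_2, \ldots, a_{k+2}$ and when

    $\alpha_1 = \bar\alpha_1, \ldots, \alpha_k = \bar\alpha_k$, $\alpha_{k+1} = \bar\alpha_{k+1}$, we have

    $G_1(\bar\alpha_1) \cdots G_{k+1}(\bar\alpha_{k+1}): x \to x + a_2x^2 + \cdots + a_{k+1}x^{k+1} + a_{k+2}x^{k+2} + \cdots.$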
Our induction is complete, and in particular (3.2.1) holds for all $k$, which proves
our theorem, and shows indeed that the $\alpha_1, \ldots, \alpha_n, \ldots$ of (3.2) are such that

    $\alpha_{n-1} = a_n + \varphi_n(a_2, \ldots, a_{n-1}),$

where $\varphi_n$ is a polynomial with coefficients in $F$. We note in passing that (3.1)
and (3.2) may be generalized to groups $\mathfrak{G}(R)$ if $R$ is a ring of characteristic
zero which admits the rational field.
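
The inductive step in the proof of (3.2) is effectively an algorithm: at stage $k$ the next parameter is the still unmatched coefficient of $x^{k+1}$. The sketch below illustrates this on truncated series with exact rational arithmetic. It is not taken from the paper: the names `product`, `decompose` and `rebuild` are hypothetical, and the product $FG$ is assumed to mean "substitute $F$ first, then $G$" (the opposite convention changes the later parameters but not the argument).

```python
from fractions import Fraction

def mul(a, b, order):
    """Product of two truncated power series."""
    out = [Fraction(0)] * (order + 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            if i + j > order:
                break
            out[i + j] += ai * bj
    return out

def compose(outer, inner, order):
    """Coefficients of outer(inner(x)) up to x^order."""
    result = [Fraction(0)] * (order + 1)
    power = [Fraction(1)] + [Fraction(0)] * order
    for c in outer:
        for i in range(order + 1):
            result[i] += c * power[i]
        power = mul(power, inner, order)
    return result

def one_param(k, alpha, order):
    """Series (3.1.2) for G_k(alpha), truncated after x^order."""
    coeffs = [Fraction(0)] * (order + 1)
    term, n = Fraction(1), 0
    while n * k + 1 <= order:
        coeffs[n * k + 1] = term * Fraction(alpha) ** n
        term = term * (1 + n * k) / (n + 1)
        n += 1
    return coeffs

def product(first, second, order):
    """Group product, ASSUMED to mean: apply `first`, then `second`."""
    return compose(second, first, order)

def identity(order):
    return [Fraction(0), Fraction(1)] + [Fraction(0)] * (order - 1)

def decompose(a, order):
    """Parameters alpha_1..alpha_{order-1} with G = G_1(alpha_1) G_2(alpha_2) ... up to x^order."""
    partial, alphas = identity(order), []
    for k in range(1, order):
        alpha = Fraction(a[k + 1]) - partial[k + 1]  # unmatched coefficient of x^(k+1)
        alphas.append(alpha)
        partial = product(partial, one_param(k, alpha, order), order)
    return alphas

def rebuild(alphas, order):
    """Multiply the one parameter factors back together."""
    partial = identity(order)
    for k, alpha in enumerate(alphas, start=1):
        partial = product(partial, one_param(k, alpha, order), order)
    return partial
```

Multiplying by $G_k(\alpha)$ only changes coefficients from $x^{k+1}$ onward, which is why each stage can be solved without disturbing the earlier ones.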


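THEOREM 3.3. $\mathfrak{G}$ is generated (topologically) by the subgroups $\{G_1(\alpha)\}$ and
$\{G_2(\beta)\}$.


THEOREM 3.4. Every element $G$ given by

    $G: x \to x + a_{k+3}x^{k+3} + \cdots, \qquad k = 1, 2, \ldots,$

can be written as an infinite product of the form

    $G = C_kC_{k+1} \cdots,$

where each $C_k$ is a commutator of weight $k + 1$ in elements of $G_1(\alpha)$ and $G_2(\beta)$
of the form

    $C_k = (G_2(\beta), G_1(\alpha_1), G_1(\alpha_2), \ldots, G_1(\alpha_k)).$

COROLLARY 3.5. $\mathfrak{G} = \mathfrak{H}_1$, $\mathfrak{G}_{s+1} = \mathfrak{H}_s$, $s = 2, 3, \ldots$, where $\mathfrak{H}_1 \supseteq \mathfrak{H}_2 \supseteq \cdots$
is the lower central series of $\mathfrak{G}$ in the sense of (2.9.1).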


Proof of (3.3)-(3.5). By (2.1.2) we have

    $C_1 = (G_2(\beta), G_1(\alpha_1)) = x + \alpha_1\beta x^4 + \cdots$

and in general, if

    $C_k = (G_2(\beta), G_1(\alpha_1), G_1(\alpha_2), \ldots, G_1(\alpha_k)),$

then

    $C_k: x \to x + k!\,\alpha_1 \cdots \alpha_k\,\beta\,x^{k+3} + \cdots.$

Without giving details we note that we may proceed in a perfectly straightforward
manner as in the proof of (3.2) to show that if

    $G: x \to x + a_{k+3}x^{k+3} + \cdots, \qquad k = 1, 2, \ldots,$

then it is possible to choose $\beta, \alpha_1, \alpha_2, \ldots$ so that, for all integers $r = 0, 1, 2, \ldots$,

    $G \equiv C_kC_{k+1} \cdots C_{k+r} \pmod{\mathfrak{G}_{k+r+3}}.$

However, this will imply that

    $G = \lim_{r \to \infty} C_kC_{k+1} \cdots C_{k+r}$

in our topology, and establish (3.4). To prove (3.3) we remark that for any $G$
given by

    $G: x \to x + a_2x^2 + a_3x^3 + \cdots,$

we may write

(3.5.1)    $G \equiv G_1(a_2)\,G_2(a_3 - a_2^2) \pmod{\mathfrak{G}_3}$

since

    $G_1(a_2): x \to x + a_2x^2 + a_2^2x^3 + \cdots$

and

    $G_2(a_3 - a_2^2): x \to x + (a_3 - a_2^2)x^3 + \cdots$

and therefore

    $G_1(a_2)\,G_2(a_3 - a_2^2): x \to x + a_2x^2 + a_3x^3 + e_4x^4 + \cdots,$

where $e_4, e_5, \ldots$ depend on $a_2$ and $a_3$.
From (3.5.1) there is an element $G_3$ of $\mathfrak{G}_3$ such that

    $G = G_1(a_2)\,G_2(a_3 - a_2^2)\,G_3 = G_1(a_2)\,G_2(a_3 - a_2^2)\,C_1C_2 \cdots$

by (3.4).

To prove (3.5) we observe that we have proved that every element of $\mathfrak{G}_{s+1}$
belongs to $\mathfrak{H}_s$, $s \geq 2$, since any $G_{s+1}$ is expressible as a product of commutators
of weight at least $s$ in the elements of $\mathfrak{G}$. If $H_s \in \mathfrak{H}_s$, then let

    $H_s = x + a_tx^t + \cdots.$

$H_s$ is either a product of commutators of weight $s$ in elements of $\mathfrak{G}$, or the
limit of such a product. However, any commutator of weight $s$ is in $\mathfrak{G}_{s+1}$,
and hence $t > s + 1$, that is $\mathfrak{H}_s = \mathfrak{G}_{s+1}$ and we have our corollary.


4. The Lie algebra of a group $\mathfrak{G}(F)$. To conclude our discussion of groups
$\mathfrak{G}(F)$, where $F$ is of characteristic 0, we remark that we may associate with $\mathfrak{G}$
a Lie algebra $\mathfrak{L}$ over $F$ which determines $\mathfrak{G}$ in the usual fashion. For consider
the algebra $\mathfrak{L}$ whose basis over $F$ consists of the operators $x^2D, x^3D, \ldots$,
where $D$ is the formal operation of differentiation with respect to $x$. Let us set

    $\mu_1 = x^2D, \quad \mu_2 = x^3D, \quad \ldots, \quad \mu_k = x^{k+1}D, \quad \ldots.$

Then

(4.1.1)    $[\mu_i, \mu_j] = \mu_i\mu_j - \mu_j\mu_i = (j - i)\,x^{i+j+1}D = (j - i)\,\mu_{i+j},$

for $i, j = 1, 2, \ldots$.
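
For completeness, the bracket (4.1.1) can be checked by applying both products to an arbitrary formal power series $g$ (the symbol $g$ is introduced here only for this check); the second-order terms cancel:

```latex
\begin{aligned}
\mu_i\mu_j\, g &= x^{i+1} D\bigl(x^{j+1} g'\bigr) = (j+1)\,x^{i+j+1} g' + x^{i+j+2} g'', \\
\mu_j\mu_i\, g &= x^{j+1} D\bigl(x^{i+1} g'\bigr) = (i+1)\,x^{i+j+1} g' + x^{i+j+2} g'', \\
[\mu_i, \mu_j]\, g &= (j - i)\,x^{i+j+1} g' = (j - i)\,\mu_{i+j}\, g .
\end{aligned}
```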
The Lie algebra $\mathfrak{L}$ spanned by the $\mu_k$ over $F$ is generalized nilpotent in the
sense that the chain of ideals

    $\mathfrak{L} = \mathfrak{L}_1 \supset \mathfrak{L}_2 \supset \cdots \supset \mathfrak{L}_s \supset \cdots,$

where $\mathfrak{L}_{s+1} = [\mathfrak{L}_s, \mathfrak{L}]$, has intersection 0. $\mathfrak{L}_s$ is spanned by the elements
$\mu_{s+1}, \mu_{s+2}, \ldots$. We may therefore introduce infinite sums of the type

(4.1.2)    $\lambda = \alpha_1\mu_1 + \alpha_2\mu_2 + \cdots, \qquad \alpha_i \in F$

into $\mathfrak{L}$, and consider, formally at least, the differential operators

    $\exp \lambda = 1 + \lambda + \dfrac{\lambda^2}{2!} + \cdots, \qquad \lambda \in \mathfrak{L}.$

If we form

    $(\exp \lambda)x$

we obtain

(4.1.2)    $(\exp \lambda)x = x + a_2x^2 + \cdots,$

where for $k = 1, 2, \ldots$, the coefficients $a_{k+1}$ are polynomials in $\alpha_1, \alpha_2, \ldots, \alpha_k$.

That is, to every $\lambda$ in $\mathfrak{L}$ we may associate a $G(\lambda) \in \mathfrak{G}$, $G(\lambda): x \to (\exp \lambda)x$.
Since the coefficients $a_2, a_3, \ldots$ are determined as functions

    $a_{k+1} = \varphi_{k+1}(\alpha_1, \ldots, \alpha_k),$

we may consider the $\alpha_1, \alpha_2, \ldots$ as canonical parameters in $\mathfrak{G}$: they have the
familiar property that if $G(\lambda)$ is the $G$ determined by $\lambda$ above, then, for all
$t, \tau \in F$,

    $G(t\lambda)\,G(\tau\lambda) = G((t + \tau)\lambda):$

that is, the set $G(t\lambda)$ is a one parameter subgroup as $t$ runs over $F$. In particular
the one parameter subgroups (3.1.2) obtained earlier are given by

    $G_k(\alpha): x \to (\exp \alpha\mu_k)\,x = x(1 - k\alpha x^k)^{-1/k}.$
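
The last identity can be confirmed term by term. The short derivation below is an added check rather than part of the original text; it uses only $\mu_k = x^{k+1}D$ and compares the result with the expansion (3.1.2):

```latex
\begin{aligned}
\mu_k\, x^{m} &= x^{k+1} D\, x^{m} = m\, x^{m+k}, \\
\mu_k^{\,n}\, x &= 1 \cdot (1+k)(1+2k)\cdots(1+(n-1)k)\; x^{nk+1}, \\
(\exp \alpha\mu_k)\, x &= \sum_{n \ge 0} \frac{\alpha^n}{n!}\,\mu_k^{\,n}\, x
  = x + \alpha x^{k+1} + \frac{(1+k)\,\alpha^2 x^{2k+1}}{2!} + \cdots .
\end{aligned}
```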


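We note too that if $\lambda_s \in \mathfrak{L}_s$, then $(\exp \lambda_s)\,x = H_s$ is in $\mathfrak{H}_s = \mathfrak{G}_{s+1}$; that is, the
ideals $\mathfrak{L}_s$ determine the lower central series of $\mathfrak{G}$.

If we adjoin the operator $\mu_0 = xD$ to $\mathfrak{L}$ we get a Lie algebra $\mathfrak{L}^*$ spanned by
the elements $\mu_k$ ($k = 0, 1, 2, \ldots$) which has $\mathfrak{L}$ as its derived algebra. For

    $[\mu_0, \mu_k] = k\mu_k, \qquad k = 1, 2, \ldots.$

If $F$ is a field in which $e^{\alpha}$ is defined for all $\alpha \in F$, then the element

    $\lambda^* = \alpha_0\mu_0 + \alpha_1\mu_1 + \cdots$

determines, via the operator $\exp(\lambda^*)$, a transformation

    $x \to (\exp \lambda^*)\,x = b_1x + \sum b_sx^s, \qquad s = 2, 3, \ldots,$

where

    $b_1 = e^{\alpha_0} \neq 0.$

Even if exponentials are not defined in $F$, it is easy to verify that the group
of transformations $\mathfrak{G}^*$ consisting of all mappings of the form

    $x \to b_1x + \sum b_sx^s, \qquad b_1 \neq 0,$

is such that

    $(\mathfrak{G}^*, \mathfrak{G}^*) = \mathfrak{G}.$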


This group @* is the group of all automorphisms of M, over F, as in (1.4) 
above. 


5. Groups with integral coefficients. We conclude this paper with one or
two remarks about the group $\mathfrak{G}(I)$ where $I$ is the ring of integers. For convenience
we write $\mathfrak{P} = \mathfrak{G}(I)$. Let $p$ be any prime, and let $I_\alpha$ be the ideal $(p^\alpha)$ of all
integers divisible by $p^\alpha$, $\alpha = 1, 2, \ldots$. If we set $I_0 = I$, then corresponding to
the ideals

    $I = I_0 \supset I_1 \supset I_2 \supset \cdots$

we have the chain of normal subgroups

(5.0.1)    $\mathfrak{P} = \mathfrak{P}_0 \supset \mathfrak{P}_1 \supset \mathfrak{P}_2 \supset \cdots,$

where $\mathfrak{P}_\alpha = \mathfrak{G}(I_\alpha)$. If $G_\alpha$ is any element of $\mathfrak{P}_\alpha$ then

    $G_\alpha: x \to x + a_2x^2 + a_3x^3 + \cdots,$

where $p^\alpha \mid a_k$ for all $k = 2, 3, \ldots$. Now by (1.3),

    $\mathfrak{G}(I_0/I_1) \cong \mathfrak{G}(I_0)/\mathfrak{G}(I_1),$

and since $I_0/I_1 = GF(p)$, we see that the factor group $\mathfrak{P}/\mathfrak{P}_1$ is a group $\mathfrak{G}(F)$,
where $F$ is the prime field of characteristic $p$. Groups of this type have a rather
complicated structure, some indication of which will be given in a later paper.
The structure of the group $\mathfrak{P}_1$ is simpler to consider, and we prove, in fact:


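THEOREM 5.1. The groups (5.0.1) have the properties:

(1) $(\mathfrak{P}_\alpha, \mathfrak{P}_\beta) \subseteq \mathfrak{P}_{\alpha+\beta}$, $\alpha, \beta = 0, 1, 2, \ldots$;
(2) If $G_\alpha \in \mathfrak{P}_\alpha$, where $\alpha \geq 1$, then $G_\alpha^p \in \mathfrak{P}_{\alpha+1}$.


Proof. If $G_\alpha \in \mathfrak{P}_\alpha$, $G_\beta \in \mathfrak{P}_\beta$, then we have

    $G_\alpha: x \to x + a_2x^2 + a_3x^3 + \cdots,$
    $G_\beta: x \to x + b_2x^2 + b_3x^3 + \cdots,$

where $p^\alpha \mid a_s$ and $p^\beta \mid b_s$, $s = 2, 3, \ldots$. Then

    $G_\alpha G_\beta: x \to x + c_2x^2 + c_3x^3 + \cdots$

yields, by (1.1.2),

    $c_2 = a_2 + b_2,$
    $c_r = a_r + b_r + \sum_s a_s\varphi_s(b_2, \ldots, b_s), \qquad s = 2, 3, \ldots, r - 1; \quad r = 2, 3, \ldots.$

Now since $p^\beta \mid b_s$ for all $s$, $p^\beta \mid \varphi_s$ for all $s$, and hence $p^{\alpha+\beta} \mid a_s\varphi_s$ for all $s$. Hence

(5.1.3)    $c_r \equiv a_r + b_r \pmod{p^{\alpha+\beta}}, \qquad r = 2, 3, \ldots.$

If (5.1.3) holds, and if $C$ is given by

    $C: x \to x + (a_2 + b_2)x^2 + (a_3 + b_3)x^3 + \cdots,$

then $G_\alpha G_\beta \equiv C \pmod{\mathfrak{P}_{\alpha+\beta}}$; for in the homomorphism $\mathfrak{G}(I) \to \mathfrak{G}(I/I_{\alpha+\beta})$
induced by $I \to I/I_{\alpha+\beta}$, as in the proof of (1.3), the elements $G_\alpha G_\beta$ and $C$ map
into the same element in $\mathfrak{G}(I/I_{\alpha+\beta})$. However, if we form $G_\beta G_\alpha$ we have

    $G_\beta G_\alpha: x \to x + c_2'x^2 + c_3'x^3 + \cdots$

where

    $c_2' = b_2 + a_2,$
    $c_r' = b_r + a_r + \sum_s b_s\varphi_s(a_2, \ldots, a_s),$

so that

    $c_r' \equiv a_r + b_r \equiv c_r \pmod{p^{\alpha+\beta}},$

and hence, as before,

    $G_\beta G_\alpha \equiv C \equiv G_\alpha G_\beta \pmod{\mathfrak{P}_{\alpha+\beta}},$

which proves (1) of (5.1). To establish (2) we form, for $\alpha \geq 1$ and any integer $n$,

    $G_\alpha^n: x \to x + c_2(n)x^2 + c_3(n)x^3 + \cdots$

and observe that, by (1.1.2),

    $c_r(n) = na_r + \Phi_r(n), \qquad r = 2, 3, \ldots,$

where $\Phi_r(n)$ is a polynomial in $a_2, a_3, \ldots, a_r$ whose term of lowest degree is
at least of degree 2. Now if $p^\alpha$ divides $a_2, \ldots, a_r$, then $p^{2\alpha}$ divides $\Phi_r(n)$ and a
fortiori so does $p^{\alpha+1}$. Hence

    $c_r(n) \equiv na_r \pmod{p^{\alpha+1}},$
    $c_r(p) \equiv 0 \pmod{p^{\alpha+1}},$

so that $G_\alpha^p \in \mathfrak{P}_{\alpha+1}$ as required.


COROLLARY 5.2. The factor groups $\mathfrak{P}_\alpha/\mathfrak{P}_{\alpha+1}$ are infinite direct products, for
$\alpha \geq 1$, of elementary abelian groups of order $p$.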
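
Both parts of Theorem 5.1 are easy to test on truncated integer series. The sketch below is illustrative only: `compose`, `power_of` and `level` are hypothetical helper names, a substitution is stored as its list of integer coefficients up to $x^{\text{order}}$, and `level(g, p)` returns the largest $\alpha$ with $p^\alpha$ dividing every coefficient $a_2, a_3, \ldots$ that is retained.

```python
def compose(outer, inner, order):
    """Integer coefficients of outer(inner(x)) up to x^order."""
    result = [0] * (order + 1)
    power = [1] + [0] * order  # inner(x)^0
    for c in outer:
        for i in range(order + 1):
            result[i] += c * power[i]
        nxt = [0] * (order + 1)
        for i, pi in enumerate(power):
            if pi:
                for j, bj in enumerate(inner):
                    if i + j > order:
                        break
                    nxt[i + j] += pi * bj
        power = nxt
    return result

def power_of(g, n, order):
    """n-fold iterate of the substitution g (powers of one element commute)."""
    result = [0, 1] + [0] * (order - 1)
    for _ in range(n):
        result = compose(g, result, order)
    return result

def valuation(n, p):
    """Exponent of the prime p in the nonzero integer n."""
    v = 0
    while n % p == 0:
        n //= p
        v += 1
    return v

def level(g, p):
    """Largest alpha with p^alpha | a_k for all k >= 2, or None if all a_k vanish."""
    vals = [valuation(c, p) for c in g[2:] if c != 0]
    return min(vals) if vals else None
```

With $p = 3$, an element of level 1 should have a cube of level at least 2, and the two products of elements of levels 1 and 2 should agree modulo $3^{3}$, in line with (2) and (5.1.3).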
A NOTE ON BALANCED INCOMPLETE BLOCK DESIGNS 
D. A. SPROTT 


1. Introduction. A balanced incomplete block design is defined as an
arrangement of $v$ objects in $b$ blocks, each block containing $k$ objects all different,
so that there are $r$ blocks containing a given object and $\lambda$ blocks containing any
two given objects. Such designs have been studied for their combinatorial
interest, as in (3), and also for their application to statistics, where the objects
are usually varieties.

Various methods of construction have been studied by Bose (1), who developed
two "Module Theorems" and applied them to form several families of designs.
It is the purpose of this note to obtain, using Bose's first Module Theorem, some
more general series of designs.


2. Series A


THEOREM 2.1. If $v = mk + 1 = p^n$, where $p$ is prime, then the design with
parameters

    $v = mk + 1, \quad b = m(mk + 1), \quad r = mk, \quad k, \quad \lambda = k - 1$

can be constructed via the initial blocks

    $(x^i, x^{i+m}, x^{i+2m}, \ldots, x^{i+(k-1)m})$

where $x$ is a primitive element of $GF(v)$ and $i$ ranges from 0 to $m - 1$.

Proof. All differences are expressible in the form

    $x^{i+rm+sm} - x^{i+rm} = x^{i+rm}(x^{sm} - 1) = x^{i+rm+q_s}$

where

    $x^{sm} - 1 = x^{q_s} \qquad (s = 1, 2, \ldots, k - 1; \; r = 0, 1, \ldots, k - 1).$

Also, such expressions run over all possible differences. The number of such
differences, for $s$ fixed, is $mk$. Further, they are all distinct; for otherwise, for
$i \neq i'$, $r \neq r'$, we have

    $i + rm \equiv i' + r'm \pmod{mk},$
    $i - i' \equiv m(r' - r) \pmod{mk},$
    $i - i' \equiv 0 \pmod{m}.$

Hence $i = i'$, since $i$ and $i'$ are less than $m$; so

    $r - r' \equiv 0 \pmod{k},$

and therefore $r = r'$, since $r$ and $r'$ are less than $k$. This contradiction shows that,
for $s$ fixed, the differences range once over $GF(v)$; hence as $s$ ranges over its
$\lambda = k - 1$ values, the differences are symmetrically repeated, each occurring
$\lambda$ times. Thus, by the first Module Theorem, the design can be formed by adding
the elements of $GF(v)$ to the initial blocks.
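
As a concrete check of Theorem 2.1, the following sketch builds the Series A design when $v = mk + 1$ is itself a prime, so that $GF(v)$ is just the integers modulo $v$. This restriction is an assumption made here for simplicity (prime powers would need genuine finite-field arithmetic), and the function names `primitive_root`, `series_a` and `pair_counts` are hypothetical.

```python
from collections import Counter
from itertools import combinations

def primitive_root(p):
    """Smallest primitive root modulo a prime p, found by brute force."""
    for g in range(2, p):
        if len({pow(g, e, p) for e in range(1, p)}) == p - 1:
            return g
    raise ValueError("no primitive root found; is p a prime greater than 2?")

def series_a(m, k):
    """Return (v, blocks) for the Series A design with prime v = m*k + 1."""
    v = m * k + 1
    x = primitive_root(v)
    # initial blocks (x^i, x^(i+m), ..., x^(i+(k-1)m)) for i = 0, ..., m-1
    initial = [[pow(x, i + j * m, v) for j in range(k)] for i in range(m)]
    # develop each initial block by adding every element of GF(v)
    blocks = [frozenset((e + g) % v for e in block)
              for block in initial for g in range(v)]
    return v, blocks

def pair_counts(blocks):
    """Number of blocks containing each unordered pair of objects."""
    counts = Counter()
    for block in blocks:
        for pair in combinations(sorted(block), 2):
            counts[pair] += 1
    return counts
```

For $m = 3$, $k = 4$ this gives $v = 13$, $b = 39$ blocks of size 4, and every pair of objects should occur in exactly $\lambda = k - 1 = 3$ blocks.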

This Series A includes Bose’s series a: (k = 4), a, (k = 5) (2), and part of 


E, (k = 3) (1). 
3. Series B


THEOREM 3.1. If $v = 2m(2\lambda + 1) + 1 = p^n$, where $p$ is prime, then the design
with parameters

    $v = 2m(2\lambda + 1) + 1, \quad b = mv, \quad r = m(2\lambda + 1), \quad k = 2\lambda + 1, \quad \lambda$

can be constructed via the initial blocks

    $(x^i, x^{i+2m}, x^{i+4m}, \ldots, x^{i+4\lambda m})$

where $x$ is a primitive element of $GF(v)$ and $i$ ranges from 0 to $m - 1$.


Proof. Here the differences are expressible in the form

    $x^{i+2rm+q_s}$

where

    $x^{q_s} = x^{2ms} - 1 \qquad (s = 1, 2, \ldots, 2\lambda; \; r = 0, 1, \ldots, 2\lambda).$
Since x is a primitive element, we have 


ximatm _ } = 0, oman td 


0, 
x) MOD) 4 x4 1 = 0, 


Hence 
x” _ (<™ —_ Dba + 2m(s—2) + a +4 1) 
a (x*™™ a 1)(x*™ + to + a) + x*™*) 


-_ (x*™™ a 1) geet ims fo + arf 4 1) 


= ” tei maine = 1). 
Thus, if we set $c = 2\lambda - s + 1$, we have

    $x^{q_c} = x^{q_s + m(2\lambda - 2s + 1)}.$

For $s$ fixed, the differences

    $x^{i+q_s+2rm}$ and $x^{i+q_c+2rm} = x^{i+q_s+m(2\lambda - 2s + 2r + 1)}$

range together over $GF(v)$ once. For, if not, there exist $i, i', r, r'$, such that
$i = i'$, $r = r'$ do not hold simultaneously and

    $i + m(2\lambda - 2s + 2r + 1) \equiv i' + 2r'm \pmod{4m\lambda + 2m},$
    $i - i' \equiv m(2r' - 2\lambda + 2s - 2r - 1) \pmod{4m\lambda + 2m},$
    $i - i' \equiv 0 \pmod{m},$
    $i = i'.$

Thus

    $2r' - 2\lambda + 2s - 2r - 1 \equiv 0 \pmod{4\lambda + 2},$

which is impossible. Hence the differences are all distinct; since there are
$2m(2\lambda + 1)$ of them, they range once over $GF(v)$. As $s$ varies over $1, 2, \ldots, \lambda$,
the differences will range $\lambda$ times over $GF(v)$. Hence, by the first Module Theorem,
the design can be constructed by adding the elements of $GF(v)$ to the initial
blocks.


Series B includes Bose’s series S; (m = 1) (1), a3 (A = 2) (2), and part of 
T: (A = 1) (1). 


4. Series C

THEOREM 4.1. If $v = 2m(2\lambda - 1) + 1 = p^n$, where $p$ is prime, then the design
with parameters

    $v = 2m(2\lambda - 1) + 1, \quad b = mv, \quad r = 2m\lambda, \quad k = 2\lambda, \quad \lambda$

can be constructed via the initial blocks

    $(0, x^i, x^{i+2m}, \ldots, x^{i+4(\lambda-1)m})$

where $x$ is a primitive element of $GF(v)$ and $i = 0, 1, \ldots, m - 1$.


Proof. The differences not involving the zero element are just the differences
which arise from the blocks of Series B with $\lambda$ replaced by $\lambda - 1$; such
differences are symmetrically repeated and each occurs $\lambda - 1$ times. The
differences involving the zero element are

    $\pm x^i, \; \pm x^{i+2m}, \; \ldots, \; \pm x^{i+4(\lambda-1)m}.$

Since

    $x^{m(2\lambda-1)} = -1,$

these can be written as

    $x^i, \; x^{i+2m}, \; x^{i+4m}, \; \ldots, \; x^{i+4(\lambda-1)m},$
    $x^{i+m(2\lambda-1)}, \; x^{i+m(2\lambda+1)}, \; \ldots, \; x^{i+m(2\lambda-3)}.$

These differences are $2m(2\lambda - 1)$ in number and are all distinct; hence they
cover $GF(v)$ once. Thus each difference occurs $\lambda$ times in all, and the design can
be formed by the first Module Theorem.


This series includes Bose’s a; (A = 2) (2). For m = 1, one obtains the sym- 
metric series 


5. Series D


THEOREM 5.1. If $v = 4m(4\lambda + 1) + 1 = p^n$, where $p$ is prime, and if among
the $2\lambda$ expressions

    $x^{4ms} - 1 = x^{q_s} \qquad (s = 1, 2, \ldots, 2\lambda)$

there are $\lambda$ even and $\lambda$ odd powers of $x$, where $x$ is a primitive element of $GF(v)$,
then the design with parameters

    $v = 4m(4\lambda + 1) + 1, \quad b = mv, \quad r = m(4\lambda + 1), \quad k = 4\lambda + 1, \quad \lambda$

can be constructed via the initial blocks

    $(x^{2i}, x^{2i+4m}, x^{2i+8m}, \ldots, x^{2i+16\lambda m})$

where $i$ ranges from 0 to $m - 1$.


Proof. In a manner similar to that used in Theorem 3.1, it can be shown
that, if we set $c = 4\lambda - s + 1$,

    $x^{q_c} = x^{q_s + 2m(4\lambda - 2s + 1)}.$

Further, the differences are

    $x^{2i+4rm+q_s}, \qquad x^{2i+2m(2r+4\lambda-2s+1)+q_s},$

where $s$ ranges from 1 to $2\lambda$ and $r$ ranges from 0 to $4\lambda$. By the method used in
Theorem 3.1, it can be shown that, for a fixed $s$, these differences are all distinct
and are $2m(4\lambda + 1)$ in number. Hence they range over half of the non-zero
elements of $GF(v)$. Consider now the differences

    $x^{2i+4rm+q_t}, \qquad x^{2i+2m(2r+4\lambda-2s+1)+q_t},$

where $q_t$ is even or odd according as $q_s$ is odd or even. For $t$ fixed, these differences
range over the other half of the field $GF(v)$. For, if not, there exist $i, i', r, r'$,
such that one of the relations (1), (2), (3), (4) holds.

(1) $2i + 4rm + q_s \equiv 2i' + 4r'm + q_t \pmod{4m(4\lambda + 1)}$
(2) $2i + 4rm + q_s \equiv 2i' + 2m(2r' + 4\lambda - 2s + 1) + q_t \pmod{4m(4\lambda + 1)}$
(3) $2i + 2m(2r + 4\lambda - 2s + 1) + q_s \equiv 2i' + 4r'm + q_t \pmod{4m(4\lambda + 1)}$
(4) $2i + 2m(2r + 4\lambda - 2s + 1) + q_s \equiv 2i' + 2m(2r' + 4\lambda - 2s + 1) + q_t \pmod{4m(4\lambda + 1)}$.

Consider the relation (1), for example. If (1) holds, then

    $2(i - i') + 4(r - r')m + q_s - q_t \equiv 0 \pmod{4m(4\lambda + 1)},$

which is impossible, since $q_s - q_t$ is odd. Similarly, it can be shown that relations
(2), (3), and (4) are impossible.

Hence, if there exist $q_s$ even and $q_t$ odd, the differences involving this $q_s$ and
this $q_t$ range once over $GF(v)$; as $s$ and $t$ vary, the differences will, under the
condition in the theorem, range $\lambda$ times over $GF(v)$. Thus the design can be
constructed.

For $\lambda = 1$, this theorem gives Bose's series $G_1$ (1); in this case, the condition
in the theorem simplifies to the requirement that $x^{4m} + 1$ be an odd power
of $x$.


THEOREM 5.2. If the condition on the exponents $q_s$ in Theorem 5.1 is violated
by one primitive element of $GF(v)$, then it is violated by all primitive elements of
$GF(v)$.

Proof. $x^{q_a} = x^{q_b}$ implies

    $x^{4ma} = x^{4mb},$

that is,

    $4ma \equiv 4mb \pmod{4m(4\lambda + 1)},$
    $a \equiv b \pmod{4\lambda + 1}.$

Hence, in the exponents $q_s$, the $s$ may be considered as reduced modulo $4\lambda + 1$.
Let $x$ be replaced by another primitive element $y = x^t$ where $t$ is relatively
prime to $4m(4\lambda + 1)$. Then $x^{q_w}$ is replaced by

    $y^{4mw} - 1 = x^{4mtw} - 1 = x^{4mw^*} - 1 = x^{q_{w^*}},$

where $w^* \equiv wt$.
Consider these expressions $x^{q_{w^*}}$; the set of elements $w^*$ can be divided into
elements $r^*$ and elements $s^*$, where $1 \leq r^* \leq 2\lambda$, $2\lambda < s^* \leq 4\lambda$. This subdivision
determines a subdivision of the set of elements $w$ into elements $r$ and elements
$s$ where $tr \equiv r^*$ and $ts \equiv s^*$. It is clear that every $r^*$ is a $w$ as well as a $w^*$.
All the $w^*$'s are different; for if

    $w^* \equiv w_1^* \pmod{4\lambda + 1},$

then

    $tw \equiv tw_1 \pmod{4\lambda + 1},$

that is,

    $w \equiv w_1 \pmod{4\lambda + 1}.$

This is impossible since the elements $w$ are all distinct; hence the $r^*$'s are all
distinct and the $s^*$'s are all distinct.
Define now $s^{**}$ by the equation

    $s^* + s^{**} = 4\lambda + 1 \equiv 0 \pmod{4\lambda + 1}.$

Then

    $q_{s^*} = q_{4\lambda+1-(4\lambda+1-s^*)} = q_{4\lambda+1-s^{**}} = q_{s^{**}} + 2m(4\lambda + 1 - 2s^{**}).$

Also, since $2\lambda + 1 \leq s^* \leq 4\lambda$, then $1 \leq s^{**} \leq 2\lambda$. Hence the $s^{**}$'s are a subset
of the $w$'s; further, they are all distinct.

It can also be shown that the $s^{**}$'s are all different from the $r^*$'s; for if
$s^{**} \equiv r^* \pmod{4\lambda + 1}$, then

    $4\lambda + 1 - s^* \equiv r^* \pmod{4\lambda + 1},$
    $s^* + r^* \equiv 0 \pmod{4\lambda + 1},$
    $t(s + r) \equiv 0 \pmod{4\lambda + 1},$
    $(s + r) \equiv 0 \pmod{4\lambda + 1}.$

This is impossible since $s$ and $r$ are at most $2\lambda$ and $s + r \neq 0$. Hence the set of $w$'s
has been replaced by sets of elements $r^*$ and $s^{**}$ which are disjoint and have no
repeated members, that is,

    $\{r^*\} + \{s^{**}\} = \{w\} = 1, 2, 3, \ldots, 2\lambda.$

Thus any $q_w$ is replaced either by another $q_w$, or by a $q_w$ plus an even multiple
of $m$, and no $w$ is repeated. So there will be as many odd and even powers of $x$
occurring as occurred originally.


This theorem shows that the “power” condition on Series D need only be 
checked for one primitive element. 
A SHORT PROOF OF THE FACTOR THEOREM 
FOR FINITE GRAPHS 


W. T. TUTTE 


We define a graph as a set V of objects called vertices together with a set E of 
objects called edges, the two sets having no common element. With each edge 
there are associated just two vertices, called its ends. We say that an edge 
joins its ends. Two vertices may be joined by more than one edge. 

A subgraph $G'$ of a graph $G$ is a graph whose edges and vertices are edges
and vertices respectively of $G$ and in which each edge has the same ends as in
$G$. If $S$ is any set of vertices of $G$ we denote by $G_S$ the subgraph of $G$ whose
vertices are the vertices of $G$ not in $S$ and whose edges are the edges of $G$ not
having an element of $S$ as an end.

A graph is finite if V and E are both finite and infinite otherwise. In this paper 
we consider only finite graphs. 

Suppose given a finite graph $G$. For $a \in V$ and $A \in E$ we write $e(A, a) = 1$
if $a$ is an end of $A$ and $e(A, a) = 0$ otherwise. Let $f$ be a function which associates
with each vertex $a$ of $G$ a unique positive integer $f(a)$. We say that $G$ is
$f$-soluble for a given $f$ if to each $A \in E$ we can assign a non-negative integer
$g(A)$ such that

(1)    $\sum_{A \in E} e(A, a)\,g(A) = f(a)$

for each $a \in V$. If $E$ is null but $V$ is not null, we consider that $G$ is not $f$-soluble
for any $f$. We ignore the case in which $V$ and $E$ are both null (when $G$ is the
null graph).

It may be possible to solve (1) so that g(A) = 0 or 1 for each A. Then we 
call the subgraph of G whose vertices are the vertices of G and whose edges are 
those edges A of G for which g(A) = 1 an f-factor of G. Thus an f-factor of G 
is a subgraph of G such that each a € V is a vertex of the subgraph and an end 
of just f(a) edges of the subgraph. 

If n is any positive integer we define an n-factor of G as an f-factor such that 
f(a) = n for each a. 

Necessary and sufficient conditions are known for f-solubility, for the 
existence of an f-factor and for the existence of a 1-factor. We state these as 
Theorems A, B, and C after a few preliminary definitions. 

The degree $d(a)$ of a vertex $a$ of $G$ is the number of edges of $G$ having $a$ as an
end. If $S \subseteq V$ and $a \in V - S$ we denote the degree of $a$ in $G_S$ by $d_S(a)$.

Suppose $S \subseteq V$. We write $\alpha(S)$ for the number of vertices of $S$. The graph
$G_S$ is uniquely decomposable into disjoint connected parts which we call
components. (Hassler Whitney (6) uses the term connected pieces, and König
(2) zusammenhängende Bestandteile.) We write $h_u(S)$ for the number of components
of $G_S$ for which the number of vertices is odd. We write $T(S)$ for the
set of vertices of $V - S$ which are joined only to vertices of $S$.


We denote by $q(S)$ the number of components $C$ of $G_S$ for which there is
more than one vertex and

(2)    $\sum_{a \in C} f(a) \equiv 1 \pmod{2}.$

Here we write $a \in C$ to denote that $a$ is a vertex of $C$.

Now suppose $T \subseteq V - S$. If $C$ is a component of $G_{S \cup T}$ we denote by $v(C)$
the number of edges of $G$ having one end a vertex of $C$ and the other an element
of $T$. We denote by $q(S, T)$ the number of components $C$ of $G_{S \cup T}$ such that

(3)    $v(C) + \sum_{a \in C} f(a) \equiv 1 \pmod{2}.$


THEOREM A. $G$ is without a 1-factor if and only if there is a subset $S$ of $V$ such
that

(4)    $h_u(S) > \alpha(S).$

THEOREM B. $G$ is not $f$-soluble if and only if there is a subset $S$ of $V$ such that

(5)    $\sum_{a \in S} f(a) < q(S) + \sum_{c \in T(S)} f(c).$

THEOREM C. $G$ is without an $f$-factor if and only if there is a subset $S$ of $V$
and a subset $T$ of $V - S$ such that

(6)    $\sum_{a \in S} f(a) < q(S, T) + \sum_{c \in T} (f(c) - d_S(c)).$


A short proof of Theorem A has been given by the author (4). Maunsell (3) 
has improved it by substituting a piece of elementary graph theory for an 
appeal to the theory of determinants. Theorem B is readily deducible from 
Theorem C; details are given in (5). However, proofs of Theorem C, even in 
the special case dealing with n-factors, have hitherto been long and complicated 
(1; 5). In this paper we present a comparatively short argument whereby 
Theorem C is deduced as a consequence of Theorem A. 


Deduction of Theorem C from Theorem A. Suppose first that G has a 
vertex a such that d(a) < f(a). Then G can have no f-factor. Moreover (6) is 
satisfied with S = 0 and T = {a}. Thus Theorem C is trivially true in this 
case. 

In the remaining case we have $d(a) \geq f(a)$ for each $a \in V$. We write
$s(a) = d(a) - f(a)$.

Given any sufficiently large set $Q$ we define a graph $G'$ whose vertices are
elements of $Q$ in the following way. With each $c \in V$ we associate $d(c)$ distinct
elements $c_A$ of $Q$, one for each edge $A$ of $G$ such that $e(A, c) = 1$, and $s(c)$ other
distinct elements $c(1), c(2), \ldots, c(s(c))$ of $Q$. We denote the sets of the $d(c)$
elements $c_A$ and the $s(c)$ elements $c(i)$ by $X(c)$ and $Y(c)$ respectively. We
postulate that the two sets $X(c) \cup Y(c)$ defined for two distinct elements $c$
of $V$ shall have no common element. The set $V'$ of vertices of $G'$ is given by

(7)    $V' = \bigcup_{c \in V} (X(c) \cup Y(c)).$

For any edge $A$ of $G$, with ends $x$ and $y$ say, we postulate that $G'$ has just
one edge joining $x_A$ and $y_A$. We denote this also by the symbol $A$. We further
postulate that for each $c \in V$ each element of $X(c)$ is joined to each member of
$Y(c)$ by just one edge of $G'$, and that $G'$ has no edges other than those required
by these two rules.

For each $c \in V$, the elements of $X(c) \cup Y(c)$ and the edges of $G'$ joining
them constitute a subgraph, $St(c)$, of $G'$, which we call the star-graph of $c$ in $G'$.
$St(c)$ is connected if $s(c) > 0$, and in the case $s(c) = 0$ only if $d(c) = f(c) = 1$.
The diagram shows a star-graph $St(c)$ for the case $d(c) = 4$ and $f(c) = 2$.
(The edges $A$, $B$, $C$ and $D$ in this diagram do not belong to $St(c)$.)











LEMMA. $G$ has an $f$-factor if and only if $G'$ has a 1-factor.


Proof. If $G$ has an $f$-factor let $F$ be its set of edges and let $F'$ be the set of
edges of $G'$ denoted by the same letters. For each $c \in V$ we adjoin to $F'$ exactly
$s(c)$ edges joining the $s(c)$ vertices of $Y(c)$ to the $s(c)$ vertices of $X(c)$ which
are not ends of edges of $F'$. By the definition of $G'$ we can do this without introducing
into $F'$ two edges with a common end. We thus construct a 1-factor of
$G'$.

Conversely suppose $G'$ has a 1-factor whose set of edges is $H$. Let $H_0$ be the
set of edges of $H$ whose two ends are vertices of distinct star-graphs $St(c)$.
For each $c \in V$ just $s(c)$ elements of $H$ have an end in $Y(c)$ and therefore just
$d(c) - s(c) = f(c)$ elements of $H_0$ have an end in $X(c)$. It follows that the
edges of $G$ corresponding to the members of $H_0$ define an $f$-factor of $G$.
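
The Lemma can be tried out on small graphs by brute force. The sketch below is an illustration under stated assumptions, not the author's construction verbatim: vertices of $G$ are numbered $0, \ldots, n-1$, edges are pairs of distinct vertices (parallel edges are allowed, loops are not), $d(c) \geq f(c)$ is required, and the names `has_f_factor`, `build_g_prime` and `has_one_factor` are hypothetical. A vertex $c_A$ of $X(c)$ is encoded as `('x', c, index of A)` and a vertex $c(i)$ of $Y(c)$ as `('y', c, i)`.

```python
from itertools import combinations

def has_f_factor(n, edges, f):
    """Brute force: does some edge subset give every vertex a exactly f[a] incident edges?"""
    for size in range(len(edges) + 1):
        for subset in combinations(range(len(edges)), size):
            deg = [0] * n
            for idx in subset:
                u, v = edges[idx]
                deg[u] += 1
                deg[v] += 1
            if deg == list(f):
                return True
    return False

def build_g_prime(n, edges, f):
    """Vertices and edges of the auxiliary graph G' built from star-graphs."""
    deg = [0] * n
    for u, v in edges:
        deg[u] += 1
        deg[v] += 1
    if any(deg[c] < f[c] for c in range(n)):
        raise ValueError("the construction assumes d(c) >= f(c) for every vertex")
    X = {c: [] for c in range(n)}
    g_edges = []
    for idx, (u, v) in enumerate(edges):
        xu, xv = ('x', u, idx), ('x', v, idx)
        X[u].append(xu)
        X[v].append(xv)
        g_edges.append((xu, xv))          # the edge of G' also called A
    vertices = []
    for c in range(n):
        Y = [('y', c, i) for i in range(deg[c] - f[c])]
        vertices.extend(X[c] + Y)
        g_edges.extend((x, y) for x in X[c] for y in Y)  # edges of St(c)
    return vertices, g_edges

def has_one_factor(vertices, g_edges):
    """Backtracking search for a perfect matching (a 1-factor)."""
    adj = {v: set() for v in vertices}
    for u, v in g_edges:
        adj[u].add(v)
        adj[v].add(u)

    def solve(remaining):
        if not remaining:
            return True
        v = min(remaining)                # always match the smallest unmatched vertex
        rest = remaining - {v}
        return any(solve(rest - {w}) for w in adj[v] if w in rest)

    return solve(frozenset(vertices))
```

On each small instance the two predicates should agree, for example `has_f_factor(n, edges, f) == has_one_factor(*build_g_prime(n, edges, f))`. Both searches are exponential and are meant only for graphs with a handful of edges.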
A subset $W$ of $V'$ will be called simple if it satisfies the following conditions
for each $a \in V$:

(i) If $X(a) \cap W \neq 0$ then $X(a) \subseteq W$,

(ii) If $Y(a) \cap W \neq 0$ then $Y(a) \subseteq W$,

(iii) At most one of $X(a)$ and $Y(a)$ is a subset of $W$.

Condition (iii) implies that $X(a)$ cannot be a subset of $W$ when $Y(a)$ is the
null set, i.e., when $d(a) = f(a)$.

Consider any simple subset $W$ of $V'$. We write $S$ and $T$ for the sets of vertices
$c$ of $G$ such that $X(c) \subseteq W$ and $Y(c) \subseteq W$ respectively. The sets $S$ and $T$ are
disjoint. We have

(8)    $\alpha(W) = \sum_{c \in T} (d(c) - f(c)) + \sum_{a \in S} d(a).$

Let $H$ be any component of $G'_W$.

It may happen that $H$ has just one vertex, which is of the form $c_A$. Then
$c \in T$ and the end of $A$ in $G$ other than $c$ belongs to $S$. The number of such
components $H$ is the number of edges $A$ of $G$ having one end in $S$ and the other
in $T$, that is

    $\sum_{c \in T} (d(c) - d_S(c)).$

Another possibility is that $H$ has just one vertex, which is of the form $c(i)$.
The number of such components is

    $\sum_{a \in S} (d(a) - f(a)).$

In the remaining case, $H$ has at least one edge. If $H$ has no edge in common
with one of the star-graphs $St(a)$ it must consist of a single edge with its two
ends. Then the number of vertices of $H$ is even. If $H$ has an edge in common
with $St(a)$ then $Y(a) \neq 0$ and so $St(a)$ is connected. Moreover $St(a)$ is then a
subgraph of $H$. A component of $G'_W$ having a connected star-graph $St(a)$
with at least one edge as a subgraph will be called large.

Suppose $H$ is large. Let $M$ be the set of all vertices $a$ of $G$ such that $St(a)$ is
a connected subgraph of $H$ with at least one edge. Then $M \subseteq V - (S \cup T)$.
$H$ is made up of these star-graphs $St(a)$, a set $N$ of edges which link them to
form a connected graph $H_0$ and a set $P$ of edges having one end a vertex of $H_0$
and one end a vertex $c_A$ such that $c \in T$. Clearly $M$ is the set of vertices of a
component $K(H)$ of $G_{S \cup T}$. We may think of $K(H)$ as derived from $H_0$ by
shrinking each of the star-graphs $St(a)$, $a \in M$, to a single vertex. Conversely
suppose $K$ is any component of $G_{S \cup T}$. If $c$ is a vertex of $K$ then $Y(c) \neq 0$ since
$c \notin T$ and therefore $St(c)$ is connected and has at least one edge. This star-graph
is a subgraph of a large component $H$ of $G'_W$ and we must have
$K = K(H)$.

For a large component $H$ of $G'_W$ the number of vertices is

    $\sum_{a \in K(H)} \{d(a) + (d(a) - f(a))\} + v(K(H)) \equiv \sum_{a \in K(H)} f(a) + v(K(H)) \pmod{2}.$

Hence the number of large components of $G'_W$ for which the number of vertices
is odd is $q(S, T)$.


Using (8) we obtain the formulae

(9)    $h_u(W) = q(S, T) + \sum_{a \in S} (d(a) - f(a)) + \sum_{c \in T} (d(c) - d_S(c)),$

(10)    $h_u(W) - \alpha(W) = q(S, T) - \sum_{a \in S} f(a) + \sum_{c \in T} (f(c) - d_S(c)).$

The quantities on the left in these equations are defined in terms of $G'$,
those on the right in terms of $G$.


Suppose there are disjoint subsets $S$ and $T$ of $V$ satisfying (6). Select two
such subsets so that $\alpha(S)$ has the least possible value. Assume that $f(b) = d(b)$
for some $b \in S$. If we replace $S$ by $S - \{b\}$ and $T$ by $T \cup \{b\}$ inequality (6)
will remain valid, for with at most $d(b)$ exceptions the numbers $v(C)$ associated
with the components of $G_{S \cup T}$ are unaltered. This contradicts the definition of
$S$. Hence $f(b) < d(b)$ for each $b \in S$. Let $W$ be the union of the sets $X(a)$
such that $a \in S$ and the sets $Y(c)$ such that $c \in T$. Then $W$ is simple since
$Y(c)$ is non-null when $X(c) \subseteq W$. It follows from (10) that $h_u(W) > \alpha(W)$
in $G'$. Hence $G'$ has no 1-factor, by Theorem A. Hence $G$ has no $f$-factor, by
the Lemma.

Conversely suppose $G$ has no $f$-factor. Then by Theorem A and the Lemma
there is a set $W$ of vertices of $G'$ such that $h_u(W) > \alpha(W)$. Choose such a $W$
so that $\alpha(W)$ has the least possible value.
Suppose there exists a € V such that Y(a) (\ W # Oand Y(a) (\ (V’ — W) 
~ 0. Write Z = W — (Y(a) (\ W). Then G’,» and G’;z differ in one component 
} only, provided that X (a) is not a subset of W, since the members of Y(a) are 
all joined to the same vertices of G’. If X(a) C W then each component of G’ » 
is a component of G’z. In either case we have h,(Z) > h,(W) — 1 and 
a(Z) < a(W) — 1. Hence hy(Z) — a(Z) > h,(W) — a(W), contrary to the 
definition of W. We deduce that Y(a) C W if Y(a) (\\ W ¥ 0. 
Suppose next that X(a) (\ W # 0. Choose b€ X(a) (\ W. Write Z = W 
) — {b}. There is at most one component of G’ » which has a vertex not a mem- 
ber of Y(a) joined to } in G’. Hence if Y(a) is contained in W the numbers 
h.(Z) and h,(W) can differ by at most one. Then 4,(Z) > h,(W) — 1, 
a(Z) = a(W) — 1 and therefore h,(Z) — a(Z) > h,(W) — a(W). This con- 
tradicts the definition of W. We deduce that, for the case X(a) (\ W ¥ 0, 
Y(a) is not a subset of W and therefore Y(a) (\ W = 0 by the result of the 
preceding paragraph. This proves that X(a) and Y(a) cannot both be subsets 
of W, since X (a) is never null. (d(a) > f(a) > 0.) 
Suppose both $X(a) \cap W$ and $X(a) \cap (V' - W)$ are non-null. We choose
$b \in X(a) \cap W$ and write $Z = W - \{b\}$ as before. Since $Y(a) \cap W = \emptyset$ all
the vertices of $Y(a)$ belong to one component of $G'_W$, for each is joined in $G'$
to each vertex of $X(a) \cap (V' - W)$. But there is at most one component of
$G'_W$ which has a vertex not a member of $Y(a)$ joined to b in $G'$. Hence with at
most two exceptions the components of $G'_W$ are components of $G'_Z$. Accordingly
$$h_u(Z) \ge h_u(W) - 2,$$
$$h_u(Z) - \alpha(Z) \ge h_u(W) - \alpha(W) - 1.$$


But $h_u(Z)$ is by definition the number of components of $G'_Z$ having an odd
number of vertices. Hence
$$h_u(Z) + \alpha(Z) \equiv \alpha(V') \pmod 2$$
and similarly
$$h_u(W) + \alpha(W) \equiv \alpha(V') \pmod 2.$$
We may write these results as
$$h_u(Z) - \alpha(Z) \equiv \alpha(V') \equiv h_u(W) - \alpha(W) \pmod 2.$$
Hence $h_u(Z) - \alpha(Z) \ge h_u(W) - \alpha(W)$ and so the definition of W is
contradicted. We deduce that $X(a) \subseteq W$ if $X(a) \cap W \ne \emptyset$.

We have now proved that W is simple. We define S and T in terms of W as 
before. Using (10) we find that S and T satisfy (6). 

This completes the proof of Theorem C. 
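
The proof above uses Theorem A in the following form: a graph has no 1-factor exactly when some vertex set W leaves more odd components after its removal than W has elements. The sketch below is only an illustration on very small graphs and is not part of the original argument. It assumes that the function written a(W) counts the elements of W, and the names `components`, `tutte_violator` and `has_one_factor` are hypothetical.

```python
from itertools import combinations

def components(vertices, edges):
    """Connected components of the graph induced on `vertices`."""
    adj = {v: set() for v in vertices}
    for u, v in edges:
        if u in adj and v in adj:
            adj[u].add(v)
            adj[v].add(u)
    seen, comps = set(), []
    for start in vertices:
        if start in seen:
            continue
        stack, comp = [start], set()
        seen.add(start)
        while stack:
            u = stack.pop()
            comp.add(u)
            for w in adj[u]:
                if w not in seen:
                    seen.add(w)
                    stack.append(w)
        comps.append(comp)
    return comps

def tutte_violator(vertices, edges):
    """Return a set W with (number of odd components of G - W) > |W|, or None."""
    vs = list(vertices)
    for r in range(len(vs) + 1):
        for W in combinations(vs, r):
            rest = [v for v in vs if v not in W]
            odd = sum(1 for c in components(rest, edges) if len(c) % 2 == 1)
            if odd > len(W):
                return set(W)
    return None

def has_one_factor(vertices, edges):
    """Brute-force search for a 1-factor (perfect matching)."""
    vs = set(vertices)
    if not vs:
        return True
    v = min(vs)
    for a, b in edges:
        if v in (a, b):
            other = b if a == v else a
            if other != v and other in vs:
                if has_one_factor(vs - {v, other}, edges):
                    return True
    return False

if __name__ == "__main__":
    path = ([0, 1, 2, 3], [(0, 1), (1, 2), (2, 3)])
    star = ([0, 1, 2, 3], [(0, 1), (0, 2), (0, 3)])
    for name, (vs, es) in (("path", path), ("star", star)):
        print(name, has_one_factor(vs, es), tutte_violator(vs, es))
```

The exhaustive search over subsets is exponential, so this is useful only as a sanity check on graphs with a handful of vertices.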
AN ELEMENTARY PROOF OF A THEOREM ABOUT
THE REPRESENTATION OF PRIMES BY
QUADRATIC FORMS


W. E. BRIGGS


1. Introduction. The theorem that every properly primitive binary quadratic
form is capable of representing infinitely many prime numbers was first proved
completely by H. Weber (5). The purpose of this paper is to give an elementary
proof of the case where the form is $ax^2 + 2bxy + cy^2$, with $a > 0$, $(a, 2b, c) = 1$,
and $D = b^2 - ac$ not a square. The cases where the form is $ax^2 + bxy + cy^2$
with b odd, and the case where the form is $ax^2 + 2bxy + cy^2$ with D a square,
can be settled very simply once the first case is taken care of, and this is done
in a page and a half in the Weber paper. The proof follows the methods used
by Atle Selberg in his elementary proof of Dirichlet's theorem about primes
in an arithmetic progression (3).


2. Representation of numbers by quadratic forms. Some basic facts concerning
the representation of numbers by binary quadratic forms are now given.
The h classes of properly primitive quadratic forms of determinant D can be
taken as $\theta_1, \theta_2, \ldots, \theta_h$, and in the case of negative determinants only the classes
which contain positive definite forms are considered. These classes considered
as elements form an Abelian group of order h under Gauss's law of composition.

A positive number m, relatively prime to 2D, is primitively representable by
forms of determinant D, if and only if D is a quadratic residue of m, and to
each root n of the congruence $x^2 \equiv D \pmod m$ correspond one or more
representations of m by each form of the class to which $mx^2 + 2nxy + [(n^2 - D)/m]y^2$
belongs. If p is a prime which does not divide 2D and of which D is a quadratic
residue, then to each of the two roots of $x^2 \equiv D \pmod p$ correspond one or
more representations of p by the forms of one or two classes. These two classes
are conjugate and can be indicated as $\theta$ and $\theta^{-1}$. If the classes are identical, then
$\theta$ is called ambiguous. The number of representations of such an m by each
form of the class to which $mx^2 + 2nxy + [(n^2 - D)/m]y^2$ belongs is equal to
the number of integral solutions of $t^2 - Du^2 = 1$. In order that this number
not be infinite when $D > 0$, it is required that the x and y in $ax^2 + 2bxy + cy^2$
satisfy the conditions





(2.1) y>0, x> AT y = vy, D> 0, 
where T, U is the fundamental solution of the Pell equation. In this way there
are now w representations of m by each form of the class where


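$$w = \begin{cases} 1, & D > 0, \\ 4, & D = -1, \\ 2, & D < -1. \end{cases} \tag{2.2}$$
Defining
$$S_\psi(x) = \sum_{\substack{p \le x \\ p = \psi}} \frac{\log p}{p} \quad \text{and} \quad Q_\psi(x) = \frac{S_\psi(x)}{\log x}, \tag{2.3}$$
where the summation is extended over primes represented by $\psi = ax^2 + 2bxy + cy^2$,
the proof will be completed by showing that $Q_\psi(x)$ is
greater than a positive constant for $x > x_0$ for any $\psi$.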


3. Several preliminary lemmas. 


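LEMMA 1. The number of lattice points N(T), subject to restriction (2.1) if
D > 0, within $ax^2 + 2bxy + cy^2 = T$ which make the form prime to 2D is
$$N(T) = \frac{\varphi(|2D|)}{|2D|}\,\beta T + O(\sqrt{T})$$
where
$$\beta = \frac{\pi}{\sqrt{-D}}, \quad D < 0,$$
$$\beta = \frac{1}{2\sqrt{D}} \log (T + U\sqrt{D}), \quad D > 0.$$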

Proof. Each $|2D|$ by $|2D|$ square built up over the plane from the origin
contains $|2D|\,\varphi(2D)$ lattice points which make the form prime to 2D (1,
pp. 235-6). Let the number of these squares lying entirely within the appropriate
area be $N'(T)$. Then $|4D^2 N'(T) - \text{Area}|$ is less than the area of the
squares which are cut by the perimeter, which is of the order of the length of
the perimeter. The perimeter is $O(\sqrt{T})$ and the area is $\beta T$, and the result
follows.


LEMMA 2. For any D not a square
$$\sum_{\substack{p \le x \\ (D|p) = 1 \\ (p, 2D) = 1}} \frac{\log p}{p} = \tfrac{1}{2} \log x + O(1).$$

This was proved by Selberg in his elementary proof of the prime-number
theorem for arithmetic progressions (4).
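
As a rough numerical illustration of Lemma 2 (not part of the original argument), the sketch below sums $\log p / p$ over the primes $p \le x$ with $(p, 2D) = 1$ and $(D|p) = 1$ and compares the result with $\tfrac12 \log x$. The helper names `primes_up_to`, `is_residue` and `lemma2_sum`, and the choice $D = -5$, are assumptions made only for this example.

```python
import math

def primes_up_to(n):
    """Sieve of Eratosthenes: all primes <= n (n >= 2)."""
    sieve = bytearray([1]) * (n + 1)
    sieve[0] = sieve[1] = 0
    for i in range(2, math.isqrt(n) + 1):
        if sieve[i]:
            sieve[i * i :: i] = bytearray(len(range(i * i, n + 1, i)))
    return [i for i in range(2, n + 1) if sieve[i]]

def is_residue(D, p):
    """Euler's criterion: True when (D|p) = 1, for an odd prime p not dividing D."""
    return pow(D % p, (p - 1) // 2, p) == 1

def lemma2_sum(D, x):
    """Sum of log p / p over primes p <= x with (p, 2D) = 1 and (D|p) = 1."""
    total = 0.0
    for p in primes_up_to(x):
        if (2 * D) % p != 0 and is_residue(D, p):
            total += math.log(p) / p
    return total

if __name__ == "__main__":
    D = -5
    for x in (10**3, 10**4, 10**5):
        # The difference should stay bounded as x grows.
        print(x, lemma2_sum(D, x) - 0.5 * math.log(x))
```

A bounded difference for a few values of x is of course no proof; it only makes the shape of the O(1) term visible.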


LEMMA 3.
$$\sideset{}{'}\sum_{\substack{p \le x \\ (p, 2D) = 1}} \frac{\log p}{p} = w \log x + O(1).$$

Here $\sum'$ means a summation over all representations, subject to restriction
(2.1) if D > 0, by a representative system of one form from each properly
primitive class of determinant D, and where w has the meaning (2.2).


Proof. This follows from Lemma 2 since each prime p with $(p, 2D) = 1$ and
$(D|p) = 1$ has 2w representations in all by the classes $\theta_p$ and $\theta_p^{-1}$. If $\theta_p$ is an
ambiguous class, then there are 2w representations by the single class $\theta_p = \theta_p^{-1}$.





LEMMA 4.
$$\sum_{\substack{p \le x \\ (p, 2D) = 1 \\ (D|p) = 1}} \frac{\log^{r+1} p}{p} = \frac{1}{2(r+1)} \log^{r+1} x + O(\log^r x).$$


Proof. This follows from Lemma 2 by partial summation.


LEMMA 5.
$$\sideset{}{'}\sum_{\substack{p \le x \\ (p, 2D) = 1}} \frac{\log^{r+1} p}{p} = \frac{w}{r+1} \log^{r+1} x + O(\log^r x).$$


Proof. This follows from Lemma 4 and the proof of Lemma 3.


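4. Proof of the theorem. Next consider
$$\sum_{\substack{n \le x \\ n = \psi \\ (n, 2D) = 1}} \theta_n$$
where $\theta_n = \theta_{n,x} = \sum_{d | n} \lambda_d$, $\lambda_d = \lambda_{d,x} = \mu(d) \log^2 (x/d)$, $\psi$ is a properly primitive
form of determinant D, and $n = \psi$ means n is represented by $\psi$ and that each
representation is counted with the usual restriction (2.1). But
$$\theta_n = \begin{cases} \log^2 x, & n = 1, \\ \log p \, \log (x^2/p), & n = p^\alpha,\ \alpha \ge 1, \\ 2 \log p \, \log q, & n = p^\alpha q^\beta,\ \alpha, \beta \ge 1, \\ 0 & \text{for all other } n, \end{cases}$$
where p and q denote prime numbers.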
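
The four-case evaluation of the divisor sum $\sum_{d | n} \mu(d) \log^2 (x/d)$ can be checked numerically. The following sketch is an illustration under stated assumptions rather than part of the paper: the function names `prime_factors`, `weight_sum` and `closed_form` are hypothetical, and the comparison is made in floating point.

```python
import math
from itertools import combinations

def prime_factors(n):
    """Distinct prime factors of n by trial division."""
    out, d = [], 2
    while d * d <= n:
        if n % d == 0:
            out.append(d)
            while n % d == 0:
                n //= d
        d += 1
    if n > 1:
        out.append(n)
    return out

def weight_sum(n, x):
    """Sum of mu(d) * log(x/d)**2 over the divisors d of n."""
    primes = prime_factors(n)
    total = 0.0
    # Only squarefree divisors contribute, since mu(d) = 0 otherwise.
    for r in range(len(primes) + 1):
        for subset in combinations(primes, r):
            d = math.prod(subset)
            total += (-1) ** r * math.log(x / d) ** 2
    return total

def closed_form(n, x):
    """The four-case value stated in the text."""
    primes = prime_factors(n)
    if not primes:                       # n = 1
        return math.log(x) ** 2
    if len(primes) == 1:                 # n = p^alpha
        p = primes[0]
        return math.log(p) * math.log(x * x / p)
    if len(primes) == 2:                 # n = p^alpha q^beta
        p, q = primes
        return 2 * math.log(p) * math.log(q)
    return 0.0                           # three or more distinct primes

if __name__ == "__main__":
    for n in (1, 8, 12, 30, 49, 210):
        print(n, weight_sum(n, 1000.0), closed_form(n, 1000.0))
```

The agreement reflects the identities $L^2 - (L - A)^2 = A(2L - A)$ and $L^2 - (L-A)^2 - (L-B)^2 + (L-A-B)^2 = 2AB$ with $L = \log x$, $A = \log p$, $B = \log q$.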
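Therefore, where p and q do not divide 2D,
$$\sum_{\substack{n \le x \\ n = \psi}} \theta_n = \sum_{\substack{p^\alpha \le x \\ p^\alpha = \psi}} \log p \, \log (x^2/p) + \sum_{\substack{p^\alpha q^\beta \le x \\ p^\alpha q^\beta = \psi}} \log p \, \log q + O(\log^2 x)$$
$$= \sum_{\substack{p \le x \\ p = \psi}} (2 \log x \log p - \log^2 p) + \sum_{\substack{p^\alpha \le x \\ p^\alpha = \psi \\ \alpha > 1}} \log p \, \log (x^2/p) + \sum_{\substack{pq \le x \\ pq = \psi}} \log p \, \log q + \sum_{\substack{p^\alpha q^\beta \le x \\ p^\alpha q^\beta = \psi \\ \alpha\beta > 1}} \log p \, \log q + O(\log^2 x).$$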
From this follows, using the arguments of Selberg (3) with appropriate
changes in the indices of summation, that
$$\sum_{\substack{n \le x \\ n = \psi \\ (n, 2D) = 1}} \theta_n = \sum_{\substack{p \le x \\ p = \psi \\ (p, 2D) = 1}} \log^2 p + \sum_{\substack{pq \le x \\ pq = \psi \\ (pq, 2D) = 1}} \log p \, \log q + O(x). \tag{4.1}$$


On the other hand,
$$\sum_{\substack{n \le x \\ n = \psi \\ (n, 2D) = 1}} \theta_n = \sum_{\substack{d \le x \\ (d, 2D) = 1}} \lambda_d \sum_{\substack{n \le x,\ d | n \\ n = \psi \\ (n, 2D) = 1}} 1.$$
The second sum on the right is the number of multiples of d which are relatively
prime to 2D, less than or equal to x, and represented by $\psi$. This is RS
where R is the number of representations of d by the forms of determinant D,
and S is the number of numbers relatively prime to 2D and less than or equal
to x/d which are represented by $\theta_d^{-1}\psi$ if $\theta_d$ represents d. From (2, p. 144)


R= w)) (Djs) 
bid 
and from Lemma 1, 
_ 9([2D}) ,x po 


since 8 does not depend on any particular class but only on D. Therefore 


n f x =a} g = $22) 
RS = {w> wpa gt OV x/d)¢, B= “on, 
and 
w= wes 5 YE H+ Veo F My wp). 
nqr d<z bid d<qr bla 
a, — (4, 2D)=1 (d,2D)=1 


Next the error term is estimated. Let 
lAa! 
- = 
ata d 
where $R_d$ is the number of representations of d by classes of forms of determinant D.
By Lemma 1, N(T) is independent of the class and since $R_d$ is
$N(d) - N(d - 1)$ summed over the h classes, it follows that


IE| < n> XO — Nit — 1) 


2 Vt 
A simple calculation shows that E = O(+/x), and therefore 


(4.2) Don = wh'x » ns D> (D\5) + O(x), 


nar 


E == Ra 





log” r + O(log’x). 


n= (d,2D)=1 
(n, 2D)=1 


where $\beta'$ and w depend only on D, so that (4.2) holds for a $\psi$ in any properly
primitive class.


Comparing (4.1) and (4.2) and summing over the h properly primitive
classes, there results


(4.3) , 
1 
> log’p + DO 1g pog¢ = 3 > log’p + _ log p log g\ + O(x) . 
op ae | \elne (we. tbat 
(p, 2D)=—1 (pq, 2D)=—1 
By partial summation one gets 
log*p log plogg 1 log*p ’ log p log g 
» -* Tees 7 -1{ iy. * » pq | 
p=v pa=wv & 2D)= (pq. 2D)=1 | 
(p, 2D) =1 (p¢,2D)=—1 


+ O(log x). 


Since each pg in the summation on the right above appears 4w times, it 
follows that 


pe<z Pq paz «CP ogczip =F 
(pq.2D)=1 (p.2D)=1 (q,2D)—1 
(D|p)=1 (D\g=1 


This is easily evaluated by using Lemmas 2 and 4 giving $\frac{w}{2} \log^2 x + O(\log x)$.
Therefore from this and Lemma 5,


paz P pe<z Pq 


p= pqa=v 
(p,2D)=1 (pq, 2D)=1 


= * log*x + O(log x). 


By partial summation from (4.3) results 


log*p log p log g 
> Se+ ¥ log pq 
PAI p Pq 


past 
p= pq=w 
(p, 2D)=1 (p¢q.2D)=1 
| 
“1 > =e. > ne IED tog bg + O(log*x). 
| pz p pe<r 
(p,2D)=1 (p¢q.2D)=—1 
’ log’ 1 a log’ 
a“ EE IED log pq = > 108 p 108d > Slee 
pe<z pq<z Pq pe<z Pq 
ws. $D)=1 (pq, 2D)=1 (pq. 2D)—1 


But each of the two symmetric terms on the right above can be written as 


2 
> log’p log q 
par p e<zip 
(yp, 2D) =1 (¢.2D)—1 
(D |p)=1 (D\q)—1 


and by Lemma 4 this equals jw log* x + O(log* x). Using this and Lemma 5 
results in 
3
(4.5) > ee + a” aE ES tog pq = 22 log'x + O(log"). 
pat 


p P@<r 
pqa=¥ 
(p, 2D)=1 (pq. 2D)=1 
Next, 
log p log”g log p{ 5 log’g log*g| 
) > - 2, = 2 +2, =. 
past Pq paz p @<z/p qd @<z/p q 
p= (p,2D)=1 I=» =F >~* 
(pq. 2D)=1 (D |p) =1 (¢,2D)—1 (¢,2D)=1 


where is represented by y¥, and y,~' and y,), = ¥,~'y,~' = y. The above 
expression is equal to 


paz =O h p er<zip qr 
(p,2D)=—1 ar=—v> 
(D |p)=1 (r=oy—* 
(er, 2D)=—1 


by (4.4) (which holds for a ¥ from any class), and where r denotes a prime 
number. Expanding and simplifying by Lemma 4 gives the last expression 
equal to 





: log p log g log r 2. 
3h log’x >> oa + O(log*x). 
para 
(per, 2D)=1 


Therefore from (4.5), 


3 
(44.6) Yo MBL. op eblogglogr | O42.) 





PSI p parsz pqr 
p= par 
(yp, 2D)=1 (per,2D)=1 


From (4.4), we have 


(4.7) S MEP < Pie's + Oleg), 
p<r Pp h 


p= 
(p,2D)=1 





which yields by partial summation 


(4.8) } log p <2 — 7 log x + O(log log x). 
PSz p 
p=¥ 
(p,2D)=1 
Next, 
Smet Fy ets Fey as 
pq<z Pq p<z/* gczt/s Pq al<pez P ez 
Pe Pe (p,2D)=1 @=¥> 
(pq. 2D)=1 (pq, 2D)=1 (D \p)=1 =v>- 
(¢.2D)—1 


Applying (4.8) and Lemma 4, we find 
y et < FE ts 


pq<r Pq p<zi/* geri/s pq 
p= pa=v 
(pq. 2D)=1 (pq, 2D)=1 


5 lo og’x + O(log x log log x). 


Therefore from (4.4),


2 
> EL > FF log's — > SEE IE + O(log x log log x); 


par pSr'/* egzt/* 
p= Pe 
(p,2D)=1 (pq, 2D)=1 


that is, for x > Xo, 


logs BL > F tog's— SRL lone 


<= —" p q 
Pp Pp P 
(py, 2D) =1 q=¥ 

(p, 2D) =1 

(¢.2D)—1 


where the latter sum is taken over primes p and q with $\psi_p \psi_q = \psi$. Recalling
(2.3), this can be written as


log x Sy(x) > = log*x — p> So(x*) So(x!), 
10h i, 


where the sum is taken over all pairs of classes $\theta, \theta'$ such that $\psi$ belongs to the
class $\theta\theta'$.


Division of both sides by $\log^2 x = 9 \log^2 x^{1/3}$ yields
(4.9) Q(x) > p> Qo(x!) Qe-(x*), x > Xo. 
108 ~ Ory 
By (4.6) 
log*x Q(x) >2 > log p log q log r + O(log*x) 


poraz'/* pqr 
r= 


P@ 
(per. 2D)=1 


a>? 5 loda 5 elit 5 lnel 


27 00’ 0’ =p ~— oa 7 p | (ae eae q 
(p. mt (q'2D)=1 
= log r (-1-) 
x li x’? 2, r | + O log x 
(raDp)=t 
or 
] 

j aie 

(4.10) Qy (x) >= 2 * Qo(x*) Qe (x!) Qe (xt) + o(4 -). 
Then (4.8) gives 
2w log log x =) 

(4.11) Qy (x) <3 # +0( orp Z 


At this point the characters of classes of forms of determinant D are introduced
where the character $\chi$ of a class $\theta$, $\chi(\theta)$, is obtained as an Abelian group
character from the group which the classes form under composition. In general
they are divided into three categories: the principal character $\chi_0$, with
$\chi_0(\theta) = 1$ for all $\theta$; real non-principal characters, with $\chi(\theta) = \pm 1$ for all $\theta$
and which exist if and only if h is even since each character is an h-th root of
unity; and the non-real characters. These characters are needed only when h(D)
is even.


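LEMMA 6. If h is even and $\chi$ is any real non-principal character, then
$$\sum_{\substack{p \le x,\ p = \theta_p \\ \chi(\theta_p) = 1 \\ (p, 2D) = 1}} \frac{\log p}{p} = \tfrac{1}{4} \log x + O(1).$$
Here the summation is extended over all primes which are representable by
forms of classes of determinant D and for which classes $\chi(\theta) = 1$.


Proof. By considering all possible products of Gauss's generic characters,
one gets that for any real non-principal character, there exists a factor $D_1$ of D
such that $\chi(\theta) = (D_1|m)$, where m is any number prime to 2D which is representable
by forms of the class $\theta$ (5, pp. 311-312). Therefore $\chi(\theta_p) = (D_1|p)$
and since if p is represented by a form of determinant D, $(D|p) = 1$, the sum
takes the form
$$W = \sum_{\substack{p \le x \\ (D|p) = 1 \\ (D_1|p) = 1}} \frac{\log p}{p} = \frac{1}{2} \Biggl\{ \sum_{\substack{p \le x \\ (DD_1|p) = 1}} \frac{\log p}{p} - \sum_{\substack{p \le x \\ (D_1|p) = -1}} \frac{\log p}{p} + \sum_{\substack{p \le x \\ (D|p) = 1}} \frac{\log p}{p} \Biggr\}$$
since each desired prime appears twice in the sums of the right member. Since
by Lemma 2
$$\sum_{\substack{p \le x \\ (D_1|p) = -1}} \frac{\log p}{p} = \sum_{p \le x} \frac{\log p}{p} - \sum_{\substack{p \le x \\ (D_1|p) = 1}} \frac{\log p}{p} = \tfrac{1}{2} \log x + O(1),$$
it follows, again by Lemma 2, that
$$W = \tfrac{1}{2}\bigl(\tfrac{1}{2} - \tfrac{1}{2} + \tfrac{1}{2}\bigr) \log x + O(1) = \tfrac{1}{4} \log x + O(1).$$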


LEMMA 7. Suppose h(D) is even and there is a set of different classes of properly
primitive forms of determinant D, $\theta_1, \theta_2, \ldots, \theta_k$, and that $k \ge \tfrac{1}{2}h$, and that for
each real character $\chi$ for forms of determinant D, there is a $\theta$ in the set with
$\chi(\theta) = 1$. Let $\psi$ be a properly primitive form of determinant D, and suppose that
there is a $\theta$ and a $\theta'$, not necessarily different, belonging to the set, such that $\psi$
belongs to the class $\theta\theta'$. Then there is a triple of classes belonging to the set, $\theta, \theta', \theta''$,
such that $\theta\theta'\theta'' = \psi$ under composition.


Proof. The proof follows from the proof of Selberg's lemma (3, Lemma 2)
by replacing primitive residue classes by properly primitive classes of quadratic
forms.


LEMMA 8. If h(D) is odd, and there is a set of different classes of properly
primitive forms of determinant D, $\theta_1, \theta_2, \ldots, \theta_k$ with $k > h/2$, and if a $\theta$ and a $\theta'$,
not necessarily different, such that $\theta\theta' = \psi$, belong to this set, then there exists a
triple of classes belonging to the set, $\theta, \theta', \theta''$, such that $\theta\theta'\theta'' = \psi$.


The proof is the same as the proof of the preceding lemma for $k > \tfrac{1}{2}h$.
Now it can be shown that


1 
Qy(x) > (130) "x" x > Xo. 

Assume that for some large x 

1 
(4.12) Qy(x) < 130h ° 
By Lemma 3, 

1 
> Qe(x*) = jog 2 [log x + O(1)] = w+0O (4 -), 

But 


log log x) 
Qo(x* ) <2 —r O ( ios 


for all @ by (4.11). 
Therefore there are at least the greatest integer in $(h + 1)/2$ classes $\theta$ with


1 
4 — = . 
Qe(x ) > 130k" ' x > Xo. 
From (4.9), 
Qy (x) > Ink ah 7 p> Qo(x*) Qe (x*), x > Xo, 
9 ot 
and (4.12), 
GG) < <= a - 

Therefore 


E One!) Qty >0(-)> 1, 
my 


10h 130h 15h 


Therefore there exists at least one pair of classes $\theta, \theta'$ with $\theta\theta' = \psi$ such that


Qe(x*) Qe (x*) > be 


or 





: ‘ by (4.11) 


1 
Qo(x*) > 173 z> 
Qe (x"’") 15h (= 4 ‘) bh? 


1 1 
= 30wh + 15h’ > 13082’ oom 
















and likewise 


j . 
Qe (x ) f = aor? x > Xo. 


5. Completion of proof for h(D) even. By Lemma 6 


) > og? = 5 log x + O(1) >= 5 log %, x > Xe, 
pSr p 
p=, 
x(0y)—1 
or 
5 ine , ls sae 
, paz P 9 4s ; 
x(@)—1 p= 
or 
1 log = 1 : 
p> log x Pe > Qo (x! ) >s x > Xo. 
x(@)=—1 p= eat 


Therefore there exists at least one @ with 


1 
Que!) > a5 > Ta0n? 
and with $\chi(\theta) = 1$ for each real non-principal character $\chi$.
Thus there is a set of different classes $\theta_1, \theta_2, \ldots, \theta_k$ with $k \ge h/2$, such that
for $i = 1, 2, \ldots, k$,
Qe. (x!) > ro 


and such that for each real character $\chi$ there is a $\theta_i$ with $\chi(\theta_i) = 1$, and finally
such that there exist classes $\theta, \theta'$ with $\theta\theta' = \psi$. Therefore by Lemma 7, there
exist classes $\theta, \theta', \theta''$ belonging to the set with $\theta\theta'\theta'' = \psi$. Then by (4.10)


2 1 1 
Qy(x) > 7 Qo(x!*) Qer(x*) Qe(x?) — O (2) > 130)" 


for $x > x_0$, which completes the proof for h(D) even.


6. Completion of proof for h(D) odd. Again there is a set of different classes
$\theta_1, \theta_2, \ldots, \theta_k$, with $k > h/2$ such that for $i = 1, 2, \ldots, k$,


Qo, (x') > == aor 


and such that there exist classes $\theta, \theta'$ in the set with $\theta\theta' = \psi$. Therefore by
Lemma 8 there exists a triple of classes $\theta, \theta', \theta''$ belonging to the set with
$\theta\theta'\theta'' = \psi$. Then again by (4.10)


Q(x) > 2 Qo(x*) Qe-(x*) Qe-(x?) — O (.)> > aa 


for $x > x_0$, which completes the proof of the theorem.

THE SIXTEENTH POWER RESIDUE CHARACTER OF 2
A. L. WHITEMAN


1. Introduction. The problem of giving a criterion for the eth power residue
character of 2 has long interested number theorists. This paper is primarily
concerned with the cases e = 4, 8 and 16. Gauss (8) proved that 2 is a biquadratic
residue of a prime p of the form 4n + 1 if and only if p is representable
as $x^2 + 64y^2$. A simple demonstration of this result is due to Dirichlet
(7). Reuschle (13) stated, but did not prove, the following criterion for the
octavic character of 2: Let p be a prime of the form 8n + 1. If n is even, then 2
is an octavic residue of p if and only if p is representable as $x^2 + 256y^2$; if n is
odd, then 2 is an octavic residue of p if and only if p is representable as $x^2 + 64y^2$
but not as $x^2 + 256y^2$. The first proof of Reuschle's criterion was given by
Western (14).

Cunningham (4) examined the first 118 primes of which 2 is an octavic
residue. On the basis of the evidence he conjectured a criterion for the 16th
power (sextodecimic) residue character of 2 which may be stated as follows:
Let z denote an odd number. The number 2 is a 16th power residue of a prime
p of the form 16n + 1 if and only if p is simultaneously representable in the
forms $x^2 + 1024y^2$ and $x^2 + 128y^2$ or in the forms $x^2 + 256z^2$ and $x^2 + 32z^2$.
Aigner (1) rediscovered Cunningham's criterion and gave the first proof. His
method employs class field theory. Beeger (3) noted that Cunningham's
criterion may be deduced from some complicated formulas about the field of
eighth roots of unity stated without proof by Goldscheider (9).

In the present paper the theory of cyclotomy (division of the circle) is 
employed to prove the criteria of Gauss, Reuschle and Cunningham. This is 
the method created by Gauss to derive the biquadratic character of 2, and it is 
surprising that it has not previously been used to derive the 8th and 16th 
power residue characters of 2. 

It is natural to ask for a criterion giving the 32nd power residue character 
of 2. Such a criterion is yet to be discovered. It is hoped that the present paper 
constitutes a first step in this direction. 

Received March 2, 1953. This investigation was supported by the Office of Naval Research.

Before proceeding to the proofs, we find it useful to make some preliminary
comments about the eth power residues of an odd prime p. Let g denote a
fixed primitive root of p. For an integer a not divisible by p, the index of a
(ind a) is defined by the congruence $a \equiv g^{\mathrm{ind}\, a} \pmod p$. Let E denote the highest
common divisor of e and p - 1. Then the eth power residues (mod p) are
precisely those numbers whose indices are divisible by E. Consequently the
number of eth power residues (mod p) is $(p - 1)/E$. If a is an eth power residue,
the congruence $x^e \equiv a \pmod p$ has exactly E solutions. The extension of
Euler's criterion for primes p of the form ef + 1 states that an integer a is an
eth power residue of p if and only if $a^{(p-1)/e} \equiv 1 \pmod p$.
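
The counting statements in this paragraph are easy to confirm by direct enumeration for small primes. The sketch below is only an illustration; the function names `power_residues`, `is_power_residue` and `solutions` are hypothetical, and the criterion is applied with the exponent $(p - 1)/E$, which reduces to $(p-1)/e$ when e divides p - 1.

```python
from math import gcd

def power_residues(e, p):
    """The set of nonzero e-th power residues modulo the prime p."""
    return {pow(x, e, p) for x in range(1, p)}

def is_power_residue(a, e, p):
    """Extended Euler criterion with E = gcd(e, p - 1)."""
    E = gcd(e, p - 1)
    return pow(a, (p - 1) // E, p) == 1

def solutions(a, e, p):
    """All x in 1..p-1 with x**e congruent to a modulo p."""
    return [x for x in range(1, p) if pow(x, e, p) == a % p]

if __name__ == "__main__":
    p, e = 17, 4
    residues = sorted(power_residues(e, p))
    print(residues, len(residues), (p - 1) // gcd(e, p - 1))
    print(is_power_residue(2, 4, 17), is_power_residue(2, 2, 17))
```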

Suppose now that the conditions are known under which 2 is an eth power
residue of p for $e = 2, 2^2, \ldots, 2^{k-1}$. These include the results for $e = 2^k$
except when $p \equiv 1 \pmod{2^k}$. For otherwise, since $E = (2^k, p - 1)$, we have
$E = 2^l$, $l < k$, and then the results for $e = 2^k$ are included in the results for
$2^l$.

2. Cyclotomy. For proofs of the basic formulas in the theory of cyclotomy, 
the reader should consult the treatise of Bachmann (2) or the memoir of 
Dickson (6). 

Let p be an odd prime and e a divisor of p - 1. Let g be a fixed primitive
root of p and write p - 1 = ef. The cyclotomic number (h, k) is the number
of values of y, $1 \le y \le p - 2$, for which
$$y \equiv g^{es+h}, \quad 1 + y \equiv g^{et+k} \pmod p, \tag{2.1}$$
where the values of s and t are each selected from the integers 0, 1, ..., f - 1.
Noting that $g^{ef} \equiv 1 \pmod p$, we may infer immediately that the value of
(h, k) is unchanged if either h or k is augmented by a multiple of e. The symbol
(h, k) also has the following properties (2, pp. 201-203):
$$(h, k) = (e - h, k - h); \tag{2.2}$$
$$(h, k) = \begin{cases} (k, h) & (f \text{ even}), \\ (k + \tfrac{1}{2}e, h + \tfrac{1}{2}e) & (f \text{ odd}); \end{cases} \tag{2.3}$$
$$\sum_{k=0}^{e-1} (h, k) = \begin{cases} f - 1 & (h = 0,\ f \text{ even, or } h = \tfrac{1}{2}e,\ f \text{ odd}), \\ f & (\text{otherwise}). \end{cases} \tag{2.4}$$
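
The definition (2.1) and the properties (2.2) to (2.4) can be checked by brute force for small primes. The sketch below is an illustration only. It takes the exceptional row in (2.4) for f odd to be h = e/2, and it uses the least primitive root; the names `index_table` and `cyclotomic_numbers` are hypothetical.

```python
def index_table(p):
    """Indices ind a (a = 1..p-1) with respect to the least primitive root of p."""
    for g in range(2, p):
        table, v = {}, 1
        for i in range(p - 1):
            if v in table:          # g has order < p - 1, try the next candidate
                break
            table[v] = i
            v = v * g % p
        if len(table) == p - 1:
            return table
    raise ValueError("p must be an odd prime")

def cyclotomic_numbers(p, e):
    """Matrix C with C[h][k] = (h, k) for the odd prime p = e*f + 1."""
    if (p - 1) % e != 0:
        raise ValueError("e must divide p - 1")
    ind = index_table(p)
    C = [[0] * e for _ in range(e)]
    for y in range(1, p - 1):       # y = 1, ..., p - 2
        C[ind[y] % e][ind[y + 1] % e] += 1
    return C

if __name__ == "__main__":
    for row in cyclotomic_numbers(17, 4):
        print(row)
```

For p = 17 and e = 4 the printed rows sum to f - 1 = 3 in row 0 and to f = 4 in the other rows, as (2.4) predicts for f even.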


Let m, n denote integers and put $\beta = \exp (2\pi i/e)$. Then we define the Jacobi
sum (2, p. 122)
$$\psi(\beta^m, \beta^n) = \sum_{a=0}^{p-1} \beta^{m \,\mathrm{ind}\, a + n \,\mathrm{ind}\,(1-a)}, \tag{2.5}$$
where the convention is made that $\beta^{\mathrm{ind}\, 0} = 0$. Now replace m in (2.5) by vn,
where v is an integer. For $a \ne 0$, replace a by -a and put $a = g^b$. Observing
that $\beta^{\mathrm{ind}(-1)} = \beta^{ef/2} = (-1)^f$, we find that (2.5) becomes
$$\psi(\beta^{vn}, \beta^n) = (-1)^{vnf} \sum_{b=0}^{p-2} \beta^{n(vb + \mathrm{ind}\,(1 + g^b))}.$$
In the last sum, we collect those exponents of $\beta$ which are in the same residue
class (mod e). Put
$$b = es + h, \quad 0 \le h \le e - 1, \quad 0 \le s \le f - 1.$$
For a fixed value of h, the number of solutions of the congruence
$$vh + \mathrm{ind}\,(1 + g^{es+h}) \equiv i \pmod e$$
is the same as the number of solutions of the congruence
$$1 + g^{es+h} \equiv g^{et+i-vh} \pmod p, \quad 0 \le s, t \le f - 1,$$
and hence is equal to the cyclotomic number (h, i - vh). The finite Fourier
series expansion of $\psi(\beta^{vn}, \beta^n)$ is therefore given by
$$\psi(\beta^{vn}, \beta^n) = (-1)^{vnf} \sum_{i=0}^{e-1} B(i, v)\, \beta^{ni}, \tag{2.6}$$
where
$$B(i, v) = \sum_{h=0}^{e-1} (h, i - vh). \tag{2.7}$$


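The sum B(i, v) has been studied by Dickson (5) and by Hurwitz (11).
From (2.3) and (2.4) it follows that
$$B(i, 0) = \sum_{h=0}^{e-1} (h, i) = \begin{cases} f - 1 & (i = 0), \\ f & (1 \le i \le e - 1). \end{cases} \tag{2.8}$$
We have also the identity
$$B(i, v) = B(i, e - v - 1). \tag{2.9}$$
To prove (2.9) we employ (2.2). Then we get
$$\sum_{h=0}^{e-1} (h, i - vh) = \sum_{h=0}^{e-1} (e - h, i - vh - h) = \sum_{h=0}^{e-1} (h, i - (e - v - 1)h).$$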


We next let $\alpha$ denote a root of the equation $\alpha^{p-1} = 1$ and put $\zeta = \exp (2\pi i/p)$.
The Jacobi sum (2.5) is closely related to the Lagrange sum (2, p. 83) defined
by
$$\tau(\alpha) = \sum_{a=0}^{p-1} \alpha^{\mathrm{ind}\, a}\, \zeta^a. \tag{2.10}$$
Indeed we have the formula (2, p. 86)
$$\psi(\beta^m, \beta^n) = \tau(\beta^m)\, \tau(\beta^n)/\tau(\beta^{m+n}), \tag{2.11}$$
when m + n is not divisible by e. We have also the easily proved formula
(2, p. 87)
$$\tau(\beta^m)\, \tau(\beta^{-m}) = (-1)^{mf} p, \tag{2.12}$$
if m is not divisible by e. Using (2.11) and (2.12) we may deduce at once that
$$\psi(\beta^m, \beta^n) = \psi(\beta^n, \beta^m) = (-1)^{nf}\, \psi(\beta^{-m-n}, \beta^n), \tag{2.13}$$
and (2, p. 123)
$$\psi(\beta^m, \beta^n)\, \psi(\beta^{-m}, \beta^{-n}) = p, \tag{2.14}$$
provided no one of m, n, m + n is divisible by e.
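
Relation (2.14) says that the Jacobi sum has absolute value $\sqrt{p}$, because $\psi(\beta^{-m}, \beta^{-n})$ is the complex conjugate of $\psi(\beta^m, \beta^n)$. The sketch below checks this in floating point. It assumes that (2.5) is the sum of $\beta^{m \,\mathrm{ind}\, a + n \,\mathrm{ind}(1-a)}$ over a, with the terms a = 0 and a = 1 contributing nothing; the names `index_table` and `jacobi_sum` are hypothetical.

```python
import cmath

def index_table(p):
    """Indices ind a (a = 1..p-1) with respect to the least primitive root of p."""
    for g in range(2, p):
        table, v = {}, 1
        for i in range(p - 1):
            if v in table:
                break
            table[v] = i
            v = v * g % p
        if len(table) == p - 1:
            return table
    raise ValueError("p must be an odd prime")

def jacobi_sum(p, e, m, n):
    """psi(beta^m, beta^n) for beta = exp(2*pi*i/e), e dividing p - 1."""
    ind = index_table(p)
    beta = cmath.exp(2j * cmath.pi / e)
    total = 0j
    for a in range(2, p):           # a = 0 and a = 1 give zero terms
        exponent = (m * ind[a] + n * ind[(1 - a) % p]) % e
        total += beta ** exponent
    return total

if __name__ == "__main__":
    z = jacobi_sum(17, 4, 1, 1)
    print(z, abs(z) ** 2)           # the squared modulus should be close to 17
```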

Jacobi (12, p. 167) stated without proof the following property of the
Lagrange sum (2.10). If the integer m is defined by the congruence $g^m \equiv 2 \pmod p$, then
$$\tau(-1)\, \tau(\alpha^2) = \alpha^{2m}\, \tau(\alpha)\, \tau(-\alpha). \tag{2.15}$$
A proof of (2.15) attributed to H. H. Mitchell is given by Dickson (6, p. 407).
Another proof appears in the book of Hasse (10, p. 442).

We shall require the following lemma.

LEMMA. If $e = 2^k$, $k \ge 1$ and B(i, v) is defined by (2.7), then
$$\sum_{v=0}^{\frac{1}{2}e-1} \bigl(B(i, v) - B(i + \tfrac{1}{2}e, v)\bigr) = \tfrac{1}{2}e\bigl((0, i) - (0, i + \tfrac{1}{2}e)\bigr). \tag{2.16}$$
In order to establish (2.16) we use (2.7) and get
$$\sum_{v=0}^{e-1} B(i, v) = \sum_{v=0}^{e-1} \sum_{h=0}^{e-1} (h, i - vh) = e\,(0, i) + \sum_{h=1}^{e-1} \sum_{v=0}^{e-1} (h, i - vh). \tag{2.17}$$
Now replace i by $i + \tfrac{1}{2}e$. For a fixed value of h, $1 \le h \le e - 1$, put
$h = 2^a b$, $0 \le a \le k - 1$, b odd.
Since e is a power of 2 and b is odd, vb runs over a complete residue system
(mod e) whenever v does. Hence we obtain
$$-e\,(0, i + \tfrac{1}{2}e) + \sum_{v=0}^{e-1} B(i + \tfrac{1}{2}e, v) = \sum_{h=1}^{e-1} \sum_{v=0}^{e-1} \bigl(h, i + 2^a(2^{k-a-1} - vb)\bigr) \tag{2.18}$$
$$= \sum_{h=1}^{e-1} \sum_{v=0}^{e-1} (h, i - 2^a vb) = \sum_{h=1}^{e-1} \sum_{v=0}^{e-1} (h, i - vh).$$
Subtracting (2.18) from (2.17) we derive the identity
$$\sum_{v=0}^{e-1} \bigl(B(i, v) - B(i + \tfrac{1}{2}e, v)\bigr) = e\bigl((0, i) - (0, i + \tfrac{1}{2}e)\bigr).$$
The Lemma now follows at once with the aid of (2.9).
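
The Lemma can be tested numerically from the cyclotomic numbers themselves. The sketch below is an illustration under stated assumptions: it reads (2.7) as $B(i, v) = \sum_h (h, i - vh)$ and reads (2.16) as a sum over $v = 0, \ldots, \tfrac12 e - 1$ with right member $\tfrac12 e((0, i) - (0, i + \tfrac12 e))$. The function names are hypothetical.

```python
def index_table(p):
    """Indices ind a (a = 1..p-1) with respect to the least primitive root of p."""
    for g in range(2, p):
        table, v = {}, 1
        for i in range(p - 1):
            if v in table:
                break
            table[v] = i
            v = v * g % p
        if len(table) == p - 1:
            return table
    raise ValueError("p must be an odd prime")

def cyclotomic_numbers(p, e):
    """Matrix C with C[h][k] = (h, k) for the odd prime p = e*f + 1."""
    ind = index_table(p)
    C = [[0] * e for _ in range(e)]
    for y in range(1, p - 1):
        C[ind[y] % e][ind[y + 1] % e] += 1
    return C

def B(C, i, v):
    """B(i, v) = sum over h of (h, i - v*h), second index reduced mod e."""
    e = len(C)
    return sum(C[h][(i - v * h) % e] for h in range(e))

def lemma_sides(p, e, i):
    """Left and right members of (2.16) for e a power of 2 dividing p - 1."""
    C = cyclotomic_numbers(p, e)
    half = e // 2
    left = sum(B(C, i, v) - B(C, i + half, v) for v in range(half))
    right = half * (C[0][i % e] - C[0][(i + half) % e])
    return left, right

if __name__ == "__main__":
    for i in range(8):
        print(i, lemma_sides(41, 8, i))
```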

3. The biquadratic character of 2. We consider the case e = 4 of §2 and
divide the discussion into two parts.

(i) f even. The assumption f even is, of course, necessary in order that 2 be
a biquadratic residue of p. In this case the relations which follow from (2.2)
and (2.3) may be summarized schematically by means of the matrix
$$\begin{array}{cccc} A & B & C & D \\ B & D & E & E \\ C & E & C & E \\ D & E & E & B \end{array} \tag{3.1}$$
in which the letter in the hth row and kth column (h, k = 0, 1, 2, 3) represents
the value of the cyclotomic number (h, k). Applying (2.4) for h = 0, 1, 2 and
using (3.1), we get
$$A + B + C + D = f - 1, \quad B + D + 2E = f, \quad C + E = \tfrac{1}{2}f. \tag{3.2}$$


Returning to (2.6) we see that in this case $\beta = \exp (2\pi i/4) = i$. Take
v = n = 1 and note that $\psi(i^3, i^3)$ is the complex conjugate of $\psi(i, i)$. Evidently
(2.14) implies
$$\psi(i, i) = a + bi, \quad a = B(01) - B(21), \quad b = B(11) - B(31), \tag{3.3}$$
where $p = a^2 + b^2$. Also we have from (2.7) and (3.1)
$$B(01) = A + C + 2E, \quad B(11) = 2B + 2E, \quad B(21) = B + 2C + D, \quad B(31) = 2D + 2E. \tag{3.4}$$


By (3.4) and the first of (3.2) the formulas for a and b in (3.3) reduce to
$a = 2A + 2E - f + 1$ and $b = 2B - 2D$. Since a is odd and b is even it
follows that b is actually divisible by 4. Eliminating B, C and D from the three
equations in (3.2) we get $2A = 6E - f - 2$. Also we obtain immediately
$B + D = 2C$. It follows that
$$a = 8E - 2f - 1, \quad \tfrac{1}{4}b = B - C. \tag{3.5}$$
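
The formulas $a = 8E - 2f - 1$ and $b = 4(B - C)$ (the second follows from $b = 2B - 2D$ together with $B + D = 2C$) can be evaluated directly from the cyclotomic numbers of order 4. The sketch below is an illustration only: it reads B, C, E off the matrix (3.1) as (0,1), (0,2) and (1,2), uses the least primitive root, and the names `index_table`, `cyclotomic_numbers` and `gauss_ab` are hypothetical.

```python
def index_table(p):
    """Indices ind a (a = 1..p-1) with respect to the least primitive root of p."""
    for g in range(2, p):
        table, v = {}, 1
        for i in range(p - 1):
            if v in table:
                break
            table[v] = i
            v = v * g % p
        if len(table) == p - 1:
            return table
    raise ValueError("p must be an odd prime")

def cyclotomic_numbers(p, e):
    """Matrix C with C[h][k] = (h, k) for the odd prime p = e*f + 1."""
    ind = index_table(p)
    C = [[0] * e for _ in range(e)]
    for y in range(1, p - 1):
        C[ind[y] % e][ind[y + 1] % e] += 1
    return C

def gauss_ab(p):
    """Return (a, b) from (3.5) for a prime p = 4f + 1 with f even."""
    if p % 8 != 1:
        raise ValueError("need p = 1 (mod 8) so that f is even")
    f = (p - 1) // 4
    C = cyclotomic_numbers(p, 4)
    B_, C_, E_ = C[0][1], C[0][2], C[1][2]
    return 8 * E_ - 2 * f - 1, 4 * (B_ - C_)

if __name__ == "__main__":
    for p in (17, 41, 73, 89, 97, 113):
        a, b = gauss_ab(p)
        print(p, a, b, a * a + b * b)
```

The sign of b depends on the primitive root chosen, but $a^2 + b^2$ does not.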


The second formula in (3.5) may be used to deduce at once the criterion of
Gauss stated in the introduction. We note that the symbol (h, h) represents
the number of integers y, $1 \le y \le p - 2$, for which
$$y \equiv g^{4s+h}, \quad 1 + y \equiv g^{4t+h} \pmod p, \quad 0 \le s, t \le f - 1.$$
For every such y there exists a complementary y, since $p - y - 1 \equiv g^{4t+2f+h}$,
$p - y \equiv g^{4s+2f+h} \pmod p$. These two y's will be distinct unless $y = \tfrac{1}{2}(p - 1)$,
in which case $2 \equiv g^{-4s+2f-h} \pmod p$. Conversely, if $2 \equiv g^{4s-h} \pmod p$, then
$\tfrac{1}{2}(p - 1) \equiv g^{-4s+2f+h} \pmod p$ and $\tfrac{1}{2}(p + 1) \equiv g^{-4s+h} \pmod p$, and hence
(h, h) will be odd. Since 2 is a quadratic residue of p it follows that B = (33)
is even; also C = (22) is even if and only if 2 is a biquadratic residue of p.
Hence $\tfrac{1}{4}b$ is even if and only if 2 is a biquadratic residue of p. Thus we have
proved the theorem of Gauss (8, p. 89).


THEOREM 1. Let $p = a^2 + b^2$, a odd, b even, be a prime of the form 4n + 1. If
2 is a quadratic residue of p, then $2^{(p-1)/4} \equiv (-1)^{b/4} \pmod p$.


The result in Theorem 1 may also be formulated by stating that if 2 is a
quadratic residue of p, then $\mathrm{ind}\, 2 \equiv \tfrac{1}{2}b \pmod 4$. Clearly the validity of this
congruence does not depend upon the choice of the sign of b.
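
Theorem 1 lends itself to a direct numerical check. In the sketch below, which is only an illustration, the decomposition $p = a^2 + b^2$ is found by search rather than by cyclotomy, and the primes tested are those with $p \equiv 1 \pmod 8$, for which 2 is a quadratic residue. The names `is_prime`, `two_squares` and `check_theorem1` are hypothetical.

```python
import math

def is_prime(n):
    """Trial-division primality test."""
    if n < 2:
        return False
    return all(n % d for d in range(2, math.isqrt(n) + 1))

def two_squares(p):
    """Return (a, b) with p = a*a + b*b, a odd, b even, both non-negative."""
    for b in range(0, math.isqrt(p) + 1, 2):
        a = math.isqrt(p - b * b)
        if a * a + b * b == p and a % 2 == 1:
            return a, b
    raise ValueError("p is not a sum of an odd and an even square")

def check_theorem1(p):
    """True when 2**((p-1)/4) is congruent to (-1)**(b/4) modulo p."""
    a, b = two_squares(p)
    expected = 1 if (b // 4) % 2 == 0 else p - 1
    return pow(2, (p - 1) // 4, p) == expected

if __name__ == "__main__":
    for p in (17, 41, 73, 89, 97, 113):
        print(p, two_squares(p), check_theorem1(p))
```

For p = 17 one has b = 4 and $2^4 \equiv -1 \pmod{17}$, in agreement with $(-1)^{b/4} = -1$.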

(ii) f odd. In this case 2 is a quadratic nonresidue of p and $\tfrac{1}{2}b \equiv \pm 1 \pmod 4$
depending upon the choice of the sign of b. Following the lines of the argument
in the case f even we may readily prove again that $\mathrm{ind}\, 2 \equiv \tfrac{1}{2}b \pmod 4$. This
result is, however, ambiguous. For 2 is congruent to a number of the form
$g^{4s+1}$ or $g^{4s+3} \pmod p$ depending upon the choice of the primitive root g.


4. The octavic character of 2. In this section we again consider the case
e = 4, f even. Our discussion is based upon the congruence $2 \equiv g^f(1 - g^f)^2 \pmod p$
which, in turn, implies
$$2^{(p-1)/8} \equiv g^{f(p-1)/8}(1 - g^f)^{(p-1)/4} \pmod p. \tag{4.1}$$
In order to determine the biquadratic character of $1 - g^f$ we proceed as
follows. There are exactly $\tfrac{1}{4}(p - 1)$ roots of the congruence $x^{(p-1)/4} - g^f \equiv 0 \pmod p$,
and these roots are given by the numbers $\beta_i \equiv g^{4i+1} \pmod p$,
$i = 1, 2, \ldots, \tfrac{1}{4}(p - 1)$. Hence we have the factorization
$$x^{(p-1)/4} - g^f \equiv (x - \beta_1)(x - \beta_2) \cdots (x - \beta_f) \pmod p. \tag{4.2}$$
Putting x = -1 in (4.2) we get
$$1 - g^f \equiv (1 + \beta_1)(1 + \beta_2) \cdots (1 + \beta_f) \pmod p. \tag{4.3}$$
From (3.1) and the definition of the cyclotomic number (h, k) in (2.1) it is
clear that (4.3) implies
$$\mathrm{ind}\,(1 - g^f) \equiv (11) + 2(12) + 3(13) \equiv D + E \pmod 4. \tag{4.4}$$
By (3.2) and the second of (3.5) we have $D + E = (C + E) - (B - C) = f/2 - b/4$.
Hence by (4.4), $\mathrm{ind}\,(1 - g^f) \equiv \tfrac{1}{2}f - \tfrac{1}{4}b \pmod 4$. Congruence
(4.1) may now be written in the form
$$2^{(p-1)/8} \equiv g^{f(f - b/4)} \pmod p. \tag{4.5}$$

At this point we assume that 2 is a biquadratic residue of p. By Theorem 1
$b \equiv 0 \pmod 8$. We consider two cases: Case 1, $f \equiv 0 \pmod 4$, $b \equiv 0 \pmod{16}$
or $f \equiv 2 \pmod 4$, $b \equiv 8 \pmod{16}$; Case 2, $f \equiv 0 \pmod 4$, $b \equiv 8 \pmod{16}$ or
$f \equiv 2 \pmod 4$, $b \equiv 0 \pmod{16}$. Then we find that (4.5) becomes
$$2^{(p-1)/8} \equiv \begin{cases} 1 \pmod p & (\text{Case 1}), \\ -1 \pmod p & (\text{Case 2}). \end{cases} \tag{4.6}$$
The result in (4.6) implies the criterion of Reuschle (13, p. 14) stated in the
introduction. This criterion is also expressed in


THEOREM 2. Let 2 be a biquadratic residue of a prime $p = a^2 + b^2$, a odd,
b even, of the form 8n + 1. If n is even, then $2^{(p-1)/8} \equiv (-1)^{b/8} \pmod p$; if n is
odd, then $2^{(p-1)/8} \equiv (-1)^{b/8+1} \pmod p$.
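
Theorem 2 can be checked in the same way as Theorem 1. The sketch below is an illustration, not part of the paper. It assumes the reading $2^{(p-1)/8} \equiv (-1)^{b/8}$ for n even and $(-1)^{b/8+1}$ for n odd, uses the condition $b \equiv 0 \pmod 8$ as the test for 2 being a biquadratic residue, and the function names are hypothetical.

```python
import math

def is_prime(n):
    """Trial-division primality test."""
    if n < 2:
        return False
    return all(n % d for d in range(2, math.isqrt(n) + 1))

def two_squares(p):
    """Return (a, b) with p = a*a + b*b, a odd, b even, both non-negative."""
    for b in range(0, math.isqrt(p) + 1, 2):
        a = math.isqrt(p - b * b)
        if a * a + b * b == p and a % 2 == 1:
            return a, b
    raise ValueError("p is not a sum of an odd and an even square")

def check_theorem2(p):
    """None if 2 is not a biquadratic residue of p, else whether Theorem 2 holds."""
    a, b = two_squares(p)
    if b % 8 != 0:
        return None
    n = (p - 1) // 8
    exponent = b // 8 + (n % 2)          # b/8 for n even, b/8 + 1 for n odd
    expected = 1 if exponent % 2 == 0 else p - 1
    return pow(2, (p - 1) // 8, p) == expected

if __name__ == "__main__":
    for p in (73, 89, 113, 233, 257):
        print(p, two_squares(p), check_theorem2(p))
```

For p = 73 one has b = 8 and n = 9, so the predicted value is $(-1)^2 = 1$, and indeed $2^9 = 512 = 7 \cdot 73 + 1$.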


5. The sextodecimic character of 2. We now consider the case e = 8,
f even of §2. A summary of the relations which may be derived from (2.2) and
(2.3) is given in the matrix
$$\begin{array}{cccccccc} A & B & C & D & E & F & G & H \\ B & H & I & J & K & L & M & I \\ C & I & G & M & N & O & N & J \\ D & J & M & F & L & O & O & K \\ E & K & N & L & E & K & N & L \\ F & L & O & O & K & D & J & M \\ G & M & N & O & N & J & C & I \\ H & I & J & K & L & M & I & B \end{array} \tag{5.1}$$
where the element in the hth row and kth column (h, k = 0, 1, ..., 7) denotes the
cyclotomic number (h, k). Of course, the letters in the matrix (5.1) represent
different cyclotomic numbers than the letters in the matrix (3.1). In order to
avoid confusion we shall not refer explicitly to letters of matrix (3.1) again.
Applying (2.4) for h = 0, 1, 2, 3 and 4, we now get
$$A + B + C + D + E + F + G + H = f - 1, \quad B + H + 2I = D + F + 2O, \tag{5.2}$$
$$C + G + I + J + M + 2N + O = f, \quad E + K + L + N = \tfrac{1}{2}f.$$

We next return to the sum B(i, v) defined in (2.7). For future reference we
give a list of formulas which may be derived with the aid of (5.1). To save space
the plus signs between the consecutive terms in the right members of these
formulas have been omitted.


(5.3)
B(01) = AINOEONI, B(11) = BBJOKKOJ, B(41) = EJGJEMCM,
B(51) = FKMMKFII, B(13) = B(33) = BMMDKOIL,
B(23) = B(63) = CINJNOGM, B(53) = B(73) = FIJLKJOH,
B(02) = AMNMEJNJ, B(12) = B(52) = BIOFKMJK,
B(32) = B(72) = DHJOLLIM, B(42) = EICOEOGI.


Since e = 8 in this case we have $\beta = \exp (2\pi i/8)$. In (2.6) put v = 2, n = 1
and v = 3, n = 1. Note that $\psi(\beta^6, \beta^7)$ is the conjugate of $\psi(\beta^2, \beta)$, and that
$\psi(\beta^5, \beta^7)$ is the conjugate of $\psi(\beta^3, \beta)$. If we now make use of the appropriate
formulas in (5.3), we may readily verify that (2.14) implies
$$\psi(\beta^2, \beta) = a + bi, \quad a = B(02) - B(42), \quad b = B(22) - B(62), \tag{5.4}$$
and
$$\psi(\beta^3, \beta) = c + d(\beta + \beta^3), \tag{5.5}$$
$$c = B(03) - B(43), \quad d = B(13) - B(53) = B(33) - B(73),$$
where $p = a^2 + b^2 = c^2 + 2d^2$.
In the Jacobi formula (2.15) replace a by 8 and by 6*. We thus get 
7 (8*) 7(6*)/7(8*) = 8™ (8°) r(8)/7(8°), 
7(8*) r(8)/r(8") = B™ 7(8*) r(8)/r(B*). 
By (2.13) we have 
¥(6*, B*) = ¥(6*, B*), (6°, 8) = (—1)/¥(6", 8), (6%, 8) = (—1) (6, 8). 
Hence (2.11) implies 
(5.6) (8,8) = (—1)/ 8 (6%, B), v(6*, 8?) = (—1)/ 68" ¥(6", B), 
(g™ = 2 (mod p)). 
Later we shall assume that 2 is an octavic residue of p. At this point it
suffices to assume merely that $m \equiv 0 \pmod 4$. Since f is even we derive from
(5.5) and the first equation in (5.6) the relations
$$c = B(01) - B(41), \quad 0 = B(21) - B(61), \quad d = B(11) - B(51) = B(31) - B(71), \tag{5.7}$$
where c and d are defined in (5.5).
We are now in the position to apply the lemma of §2. The results thus far
obtained provide us with explicit values for each term of the sum in the left
member of (2.16). Consider first the case i = 0. By (2.8)
$$-1 = B(00) - B(40);$$
by (5.5) and (5.7)
$$c = B(01) - B(41) = B(03) - B(43);$$
by (5.4)
$$a = B(02) - B(42).$$
Again in the case i = 1 we have by (2.8)
$$0 = B(10) - B(50);$$
by (5.5) and (5.7)
$$d = B(11) - B(51) = B(13) - B(53);$$
by (5.3)
$$0 = B(12) - B(52).$$
Proceeding in a similar fashion for i = 2 and 3 and using (5.1), we find that
(2.16) yields
$$a + 2c - 1 = 4A - 4E, \quad \tfrac{1}{2}d = B - F = D - H, \quad \tfrac{1}{4}b = C - G. \tag{5.8}$$


The first equation in (5.8) will not be needed in the sequel. We remark,
however, that it may be used to construct another proof of Theorem 2. From
the second equation in (5.8) we get at once B + H = D + F. Comparing this
result with the second equation in (5.2) we conclude that I = O. By (5.5)
and (5.7) we have
$$B(11) + B(53) = B(13) + B(51).$$
Substituting from the appropriate formulas of (5.3) we may verify that this
equation reduces to the identity J = M. We have thus established the two
simple relations
$$I = O, \quad J = M, \tag{5.9}$$
under the assumption that $m \equiv 0 \pmod 4$. We remark that it may also be
proved that I = O when $m \equiv 2 \pmod 4$.

Since $\psi(\beta^2, \beta^2) = \psi(i, i)$ we see from the second equation in (5.6) that the
number a defined in (3.3) has the same sign as the number a defined in (5.4).
Replacing the f which appears in (3.5) by 2f we get

$a = 8(1,2)_4 - 4f - 1,$

where the meaning of the subscript is clear. Since the new f is even the sign of a
is determined by means of the congruence $a \equiv -1 \pmod 8$. Let us now
observe that a number which is of the form $g^{4s+h}$ (mod p) is of the form either
$g^{8s+h}$ or $g^{8s+h+4}$ (mod p). From the definition of the cyclotomic number (h, k)
in (2.1) it follows that

$(1,2)_4 = (1,2)_8 + (1,6)_8 + (5,2)_8 + (5,6)_8 = I + M + O + J.$

Hence we get $a = 8I + 8J + 8M + 8O - 4f - 1$. In view of (5.9) this
reduces to
(5.10) $a = 16I + 16J - 4f - 1.$
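
The splitting of $(1,2)_4$ into four cyclotomic numbers of order 8 used above can be checked numerically. The following Python sketch assumes the usual definition of $(h,k)_e$ as the number of pairs (s, t), with 0 ≤ s, t < (p - 1)/e, satisfying $g^{es+h} + 1 \equiv g^{et+k} \pmod p$; the function names are illustrative and are not taken from the paper.

```python
def prime_factors(n):
    """Set of prime divisors of n, found by trial division."""
    factors, q = set(), 2
    while q * q <= n:
        while n % q == 0:
            factors.add(q)
            n //= q
        q += 1
    if n > 1:
        factors.add(n)
    return factors


def primitive_root(p):
    """Smallest primitive root g of the odd prime p."""
    qs = prime_factors(p - 1)
    for g in range(2, p):
        if all(pow(g, (p - 1) // q, p) != 1 for q in qs):
            return g
    raise ValueError("no primitive root found; is p an odd prime?")


def cyclotomic_number(p, g, e, h, k):
    """(h, k)_e: count of s in [0, f) with g^(e*s+h) + 1 in the class of g^k, f = (p-1)/e."""
    f = (p - 1) // e
    target_class = {pow(g, e * t + k, p) for t in range(f)}
    return sum((pow(g, e * s + h, p) + 1) % p in target_class for s in range(f))


def split_holds(p):
    """Check (1,2)_4 = (1,2)_8 + (1,6)_8 + (5,2)_8 + (5,6)_8 for a prime p = 8f + 1."""
    g = primitive_root(p)
    order4 = cyclotomic_number(p, g, 4, 1, 2)
    order8 = sum(cyclotomic_number(p, g, 8, h, k)
                 for h, k in [(1, 2), (1, 6), (5, 2), (5, 6)])
    return order4 == order8
```

The identity holds for every prime of the form 8f + 1 because the residue class of index 1 (mod 4) is the union of the classes of index 1 and 5 (mod 8), and likewise for index 2.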


From (5.4) and (5.7) we get $a - c = B(0,2) + B(4,1) - B(0,1) - B(4,2)$.
Using the appropriate formulas of (5.3) together with (5.9) we find that this
reduces to $a - c = 8J - 8I$. Consequently $a \equiv c \pmod 8$. Combining the
last equation with (5.10) we obtain

(5.11) $64I = p + 1 - 2a + 4c,$

where the signs of a and c are determined by means of the congruence
$a \equiv c \equiv -1 \pmod 8$. Again we point out that (5.10) and (5.11) have been
derived under the assumption that m is divisible by 4.

To obtain a criterion for the 16th power residue character of 2 we make use
of the congruence
(5.12) $2^{(p-1)/16} \equiv g^{f(p-1)/8}\,(1 - g^{2f})^{(p-1)/8} \pmod p,$

which corresponds to (4.1). We must next determine the octavic character of
$1 - g^{2f}$. The $\tfrac18(p - 1)$ roots of the congruence $x^{(p-1)/8} - g^{2f} \equiv 0 \pmod p$ are
given by the numbers $y_i \equiv g^{8i+2} \pmod p$, $i = 1, 2, \dots, \tfrac18(p - 1)$. We therefore
have the factorization

(5.13) $x^{(p-1)/8} - g^{2f} \equiv (x - y_1)(x - y_2) \cdots (x - y_f) \pmod p.$
For x = -1 (5.13) becomes
(5.14) $1 - g^{2f} \equiv (1 + y_1)(1 + y_2) \cdots (1 + y_f) \pmod p.$
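
A small numerical check of (5.14), and of the congruence (5.12) read here as $2^{(p-1)/16} \equiv g^{f^2}(1 - g^{2f})^{f} \pmod p$ with f = (p - 1)/8, may be helpful. This reading of (5.12) is an assumption; it rests on the identity $2 = i(1 - i)^2$ with $i = g^{2f}$ a square root of -1. The helper names below are illustrative only.

```python
def prime_factors(n):
    """Set of prime divisors of n, found by trial division."""
    factors, q = set(), 2
    while q * q <= n:
        while n % q == 0:
            factors.add(q)
            n //= q
        q += 1
    if n > 1:
        factors.add(n)
    return factors


def primitive_root(p):
    """Smallest primitive root g of the odd prime p."""
    qs = prime_factors(p - 1)
    for g in range(2, p):
        if all(pow(g, (p - 1) // q, p) != 1 for q in qs):
            return g
    raise ValueError("no primitive root found; is p an odd prime?")


def check_congruences(p):
    """Return (ok_514, ok_512) for a prime p of the form 16n + 1."""
    if (p - 1) % 16 != 0:
        raise ValueError("p must be of the form 16n + 1")
    f = (p - 1) // 8                      # f is even because 16 divides p - 1
    g = primitive_root(p)
    root = pow(g, 2 * f, p)               # g^(2f) squares to g^(4f) = -1 (mod p)

    # (5.14): product of (1 + y_i) over y_i = g^(8i+2), i = 1, ..., f
    product = 1
    for i in range(1, f + 1):
        product = product * (1 + pow(g, 8 * i + 2, p)) % p
    ok_514 = product == (1 - root) % p

    # (5.12), in the form assumed in the lead-in
    lhs = pow(2, (p - 1) // 16, p)
    rhs = pow(g, f * f, p) * pow((1 - root) % p, f, p) % p
    return ok_514, lhs == rhs
```

For p = 17 and g = 3, for instance, the product in (5.14) is (1 + 8)(1 + 9) ≡ 5 and $1 - g^{4}$ ≡ 1 - 13 ≡ 5 (mod 17).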


Then, as in §4, we get making use of (5.1), (5.9) and (5.14)

(5.15) $\operatorname{ind}(1 - g^{2f}) \equiv 1(2,1) + 2(2,2) + 3(2,3) + 4(2,4) + 5(2,5) + 6(2,6) + 7(2,7) \equiv -2I + 2G + 2J + 2N \pmod 8.$
Applying (5.9) to the third equation of (5.2) we have also $2J + 2N = f - C - G - 2I$.
Using this result in conjunction with the formula for $\tfrac14 b$ in (5.8) we
find that (5.15) simplifies to

$\operatorname{ind}(1 - g^{2f}) \equiv f + 4I - b/4 \pmod 8.$
The congruence (5.12) may now be put into the form
(5.16) $2^{(p-1)/16} \equiv g^{f(2f + 4I - b/4)} \pmod p.$


Now we assume that the integer m defined in (5.6) is actually divisible by 8.
This is equivalent to the assumption that 2 is an octavic residue of p. Returning
to the equation

$p = a^2 + b^2 = c^2 + 2d^2, \quad a \equiv c \equiv -1 \pmod 8,$

we note that the congruence $c^2 + 2d^2 \equiv 1 \pmod{16}$ implies that $d \equiv 0 \pmod 4$.
Furthermore it is easy to verify that $2f \equiv 0$ or $4 \pmod 8$ according
as $a \equiv -1$ or $-9 \pmod{16}$. By Theorem 2, $b \equiv 0 \pmod{16}$. Hence $p \equiv a^2 \pmod{256}$.
We consider separately two cases:

(i) $d \equiv 0 \pmod 8$. In this case we get $c^2 \equiv a^2 \pmod{128}$; whence $c \equiv a \pmod{64}$.
Converting (5.11) into a congruence (mod 256), we find that
$64I \equiv (a + 1)^2 \pmod{256}$. If $a \equiv -1 \pmod{16}$, then $4I \equiv 0 \pmod{16}$. If
$a \equiv -9 \pmod{16}$, then $4I \equiv 4 \pmod{16}$. In either event, it is clear that
$2f + 4I \equiv 0 \pmod 8$.

(ii) $d \equiv 4 \pmod 8$. In this case we get $c^2 \equiv a^2 - 32 \pmod{128}$, whence
$c \equiv a + 16 \pmod{64}$. The equation (5.11) reduces to the congruence
$64I \equiv (a + 1)^2 + 64 \pmod{256}$. If $a \equiv -1 \pmod{16}$, then $4I \equiv 4 \pmod{16}$.
If $a \equiv -9 \pmod{16}$, then $4I \equiv 8 \pmod{16}$. In either event we get
$2f + 4I \equiv 4 \pmod 8$.

The results derived in (i) and (ii) may be combined into the single congruence
$2f + 4I \equiv d \pmod 8$. The congruence (5.16) now becomes
$2^{(p-1)/16} \equiv g^{f(d - b/4)} \equiv g^{f(d + b/4)} \pmod p$. We conclude that

$2^{(p-1)/16} \equiv \begin{cases} 1, & (b/16) + (d/4) \equiv 0 \pmod 2, \\ -1, & (b/16) + (d/4) \equiv 1 \pmod 2. \end{cases}$


This completes the proof of 


THEOREM 3. Let $p = a^2 + b^2 = c^2 + 2d^2$, a and c odd, be a prime of the form
16n + 1. If $2^{(p-1)/8} \equiv 1 \pmod p$, then $2^{(p-1)/16} \equiv (-1)^{(b/16)+(d/4)} \pmod p$.


Theorem 3 is the criterion of Cunningham (4, p. 88). 
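
The criterion of Theorem 3 lends itself to a direct numerical check. The Python sketch below finds the representations $p = a^2 + b^2 = c^2 + 2d^2$ by brute force and compares $2^{(p-1)/16}$ with $(-1)^{(b/16)+(d/4)}$; the function names are illustrative, and only the parities of b/16 and d/4 matter, so the signs of b and d are taken positive.

```python
from math import isqrt


def is_prime(n):
    """Primality by trial division (adequate for small n)."""
    if n < 2:
        return False
    return all(n % q for q in range(2, isqrt(n) + 1))


def two_squares(p):
    """(a, b) with p = a^2 + b^2 and b even, or None if no such pair exists."""
    for b in range(0, isqrt(p) + 1, 2):
        a = isqrt(p - b * b)
        if a * a + b * b == p:
            return a, b
    return None


def square_plus_twice_square(p):
    """(c, d) with p = c^2 + 2*d^2, or None if no such pair exists."""
    for d in range(isqrt(p // 2) + 1):
        c = isqrt(p - 2 * d * d)
        if c * c + 2 * d * d == p:
            return c, d
    return None


def cunningham_check(p):
    """None if the hypotheses of Theorem 3 fail, else whether its conclusion holds."""
    if p % 16 != 1 or not is_prime(p) or pow(2, (p - 1) // 8, p) != 1:
        return None
    _, b = two_squares(p)
    _, d = square_plus_twice_square(p)
    predicted = 1 if (b // 16 + d // 4) % 2 == 0 else p - 1   # p - 1 represents -1
    return pow(2, (p - 1) // 16, p) == predicted
```

For example, 257 = 1² + 16² = 15² + 2·4² gives (b/16) + (d/4) = 2 and $2^{16} \equiv 1 \pmod{257}$, while 1249 = 15² + 32² = 31² + 2·12² gives (b/16) + (d/4) = 5 and $2^{78} \equiv -1 \pmod{1249}$.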
RING ISOMORPHISMS OF BANACH ALGEBRAS
IRVING KAPLANSKY 


1. Introduction. In discussing an isomorphism between two Banach 
algebras, one will ordinarily tacitly assume that the mapping is linear (i.e., 
preserves the complex scalars as well as the ring operations). In general this 
cannot be avoided; for instance if the two Banach algebras are just the field 
of complex numbers, then the isomorphism is unrestricted, and could be given 
by any one of the myriads of discontinuous automorphisms of the complex 
numbers. A similar remark applies generally to the finite-dimensional case. But 
if the algebras are genuinely infinite-dimensional in an appropriate sense, 
interesting results become possible. The first such theorem was proved by 
Arnold (1): if A and B are both algebras of all bounded operators on infinite- 
dimensional Banach spaces, then any ring isomorphism between A and B is 
automatically real-linear (or alternatively, it is either linear or conjugate 
linear relative to complex scalars). Kakutani and Mackey (5) used a similar 
argument in connection with their characterization of complex Hilbert space. 
Rickart (7) generalized Arnold’s theorem to the case of primitive Banach 
algebras with minimal ideals. 

In this paper we shall extend Rickart's result to any semi-simple Banach
algebra, the precise theorem being as follows: if $\phi$ is a ring isomorphism from
one semi-simple Banach algebra A onto another, then A is a direct sum
$A_1 \oplus A_2 \oplus A_3$ with $A_1$ finite-dimensional, $\phi$ linear on $A_2$, and $\phi$
conjugate linear on $A_3$.
Some of the preliminary lemmas (particularly Lemmas 7 and 9) may be of
independent interest.


2. Elements with infinite spectrum. Let A be a Banach algebra,¹ x an
element in A. We define the non-zero spectrum of x to consist of all scalars $\lambda$
such that $-\lambda^{-1}x$ is not quasi-regular. We insert 0 in the spectrum of x unless A
has a unit element and x is regular.


LEMMA 1. If there exists a non-zero element z with $zx - \lambda z = 0$, then $\lambda$ is in the
spectrum of x.


Proof. If not, suppose y is the quasi-inverse of $-\lambda^{-1}x$, so that
$-\lambda^{-1}x + y - \lambda^{-1}xy = 0.$
A left multiplication by z yields the contradiction z = 0.
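
Lemma 1 can be illustrated in the finite-dimensional algebra of 2 × 2 rational matrices, where u is quasi-regular exactly when 1 + u is invertible. The Python sketch below is only an illustration under that assumption; the helper names are hypothetical, and the quasi-inverse y of u is taken in the sense u + y + uy = 0 used in the proof.

```python
from fractions import Fraction

IDENTITY = [[Fraction(1), Fraction(0)], [Fraction(0), Fraction(1)]]


def mat_add(a, b):
    return [[a[i][j] + b[i][j] for j in range(2)] for i in range(2)]


def mat_mul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def mat_scale(c, a):
    return [[c * a[i][j] for j in range(2)] for i in range(2)]


def quasi_inverse(u):
    """Return y with u + y + u*y = 0, or None when u is not quasi-regular."""
    m = mat_add(IDENTITY, u)              # (1 + u)(1 + y) = 1, so y = (1 + u)^(-1) - 1
    det = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    if det == 0:
        return None
    inverse = [[m[1][1] / det, -m[0][1] / det],
               [-m[1][0] / det, m[0][0] / det]]
    return mat_add(inverse, mat_scale(Fraction(-1), IDENTITY))


def in_nonzero_spectrum(x, lam):
    """True when -x/lam has no quasi-inverse, i.e. lam is in the non-zero spectrum of x."""
    return quasi_inverse(mat_scale(Fraction(-1) / lam, x)) is None
```

With x = [[2, 1], [0, 3]] and z = [[0, 0], [0, 1]] one has zx = 3z, and accordingly `in_nonzero_spectrum(x, 3)` reports that 3 lies in the spectrum of x.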
Received September 8, 1953. 


¹All our Banach algebras will admit complex scalars. The results can probably be extended
to the case where only real scalars are assumed, but it seemed preferable in the present paper
to avoid the extra complications.

LEMMA 2. If a Banach algebra possesses an infinite set of orthogonal idempotents,
then it has an element with an infinite spectrum.


Proof. Denote the idempotents by $e_i$ and write $x = \sum \lambda_i e_i$, where the $\lambda_i$ are
distinct numbers satisfying $|\lambda_i|\,\|e_i\| < 2^{-i}$. Since $e_i x = \lambda_i e_i$, it follows from
Lemma 1 that $\lambda_i$ is in the spectrum of x.

LEMMA 3. Let e be an idempotent in a Banach algebra A. Then the non-zero
spectrum of an element of eAe is the same, whether computed in eAe or in A. The
same is true² for the subalgebra (1 - e)A(1 - e).


Proof. We can cover both cases by using a symbol f for either e or 1 - e. The
problem comes to this: given an element fxf which has a quasi-inverse in A,
prove that it already has a quasi-inverse in fAf. Let y denote the quasi-inverse
in A, so that

$fxf + y + fxfy = fxf + y + yfxf = 0.$

On left and right multiplying by f, we see that fyf is likewise a quasi-inverse of
fxf.


LEMMA 4. Let A be a Banach algebra with unit element and radical R. Suppose
that every element of A has only one number in its spectrum. Then R consists
precisely of all elements with spectrum 0, and A/R is one-dimensional.


Proof. Let N be the set of all elements with spectrum 0; of course $N \supset R$.
We note that the elements of A having (two-sided) inverses are precisely those
not in N. We are now going to prove that N is a right ideal, for which purpose
we have two things to verify.

(1) If $x \in N$ and $y \in A$, we have to prove $xy \in N$. If not, then xy is regular,
say with inverse z. Thus x has at any rate a right inverse, namely yz. But if yz
is not also a left inverse of x, then the element yzx is an idempotent other than
0 and 1, and has 0 and 1 in its spectrum, contrary to hypothesis.

(2) If $x, y \in N$, we must show $x - y \in N$. If not, $x - y = u$ is regular.
By what we have just shown, both $xu^{-1}$ and $yu^{-1}$ are in N. But then
$xu^{-1} = 1 + yu^{-1}$ is regular, a contradiction.

We have thus shown that N is a right ideal. Since it consists of quasi-regular
elements, it is part of the radical. Hence N = R, and it is immediate that A/R
is one-dimensional.


LEMMA 5. Let A be a commutative Banach algebra having exactly r regular
maximal ideals. Then A contains r orthogonal (non-zero) idempotents.


Proof. Let R be the radical of A. Then A/R is the direct sum of r copies of
the complex numbers. It is known that the r orthogonal idempotents in A/R,
which arise in this way, can be lifted to orthogonal idempotents in A (see for
example (2)).


²The symbol 1 is used here formally, and does not indicate that we are assuming the presence
of a unit element.

Our next lemma is an elementary purely algebraic one, recorded for the
convenience of the reader.


LEMMA 6. Let A be a ring with unit element and no nilpotent ideals. Suppose
that $1 = e_1 + \dots + e_n$ with the e's orthogonal idempotents such that each $e_iAe_i$
is a division ring. Then A has the descending chain condition on right ideals (and
so is the direct sum of a finite number of matrix rings over division rings).


Proof. Since A has no nilpotent ideals, and $e_iAe_i$ is a division ring, it is known
(4, p. 13) that $e_iA$ is a minimal right ideal. Moreover $A = e_1A + \dots + e_nA$
is a direct sum decomposition of A into a finite number of minimal right ideals.
It follows from the Jordan-Hölder theorem that A has the descending chain
condition on right ideals.


LEMMA 7. In any infinite-dimensional semi-simple Banach algebra there exists 
an element with an infinite spectrum. 


Proof. Suppose that A is a semi-simple Banach algebra, and that every
element of A has a finite spectrum. We shall prove that A is finite-dimensional.

It cannot be the case that every element has 0 spectrum, for then A would be
all radical. Select an element x with some non-zero spectrum, and let B be the
closed subalgebra generated by x. One knows that the number of regular maximal
ideals in B is the same as the number of non-zero numbers in the spectrum
of x. It follows from Lemma 5 that B contains idempotents.

Let then $e_1$ be an idempotent in A. By Lemma 3, the Banach algebra $e_1Ae_1$
inherits the hypothesis that all its elements have finite spectrum. There may
exist in $e_1Ae_1$ an idempotent $e_2$ other than 0 and $e_1$, and then a third one $e_3$
inside $e_2Ae_2$, etc.; but this cannot continue indefinitely, for we would find the
infinite set $\{e_i - e_{i+1}\}$ of orthogonal idempotents, contrary to Lemma 2.

Thus we may assume that $e_1Ae_1$ contains no idempotents other than 0 and $e_1$.
For any y in $e_1Ae_1$ we form the closed subalgebra C generated by y and $e_1$.
If y has two or more numbers in its spectrum, then C has at least two maximal
ideals (the presence of 0 in the spectrum is not treated specially here, since C
has a unit element). Since this contradicts Lemma 5, it must be the case that
every element in $e_1Ae_1$ has a one-element spectrum. It is known that $e_1Ae_1$,
along with A, is semi-simple. It now follows from Lemma 4 that $e_1Ae_1$ is
one-dimensional.

Let $e_1, \dots, e_n$ be a maximal set of idempotents in A such that each $e_iAe_i$ is
one-dimensional. (That such a maximal set is finite follows from Lemma 2.)
We write $e = e_1 + \dots + e_n$ and turn our attention to (1 - e)A(1 - e);
according to Lemma 3, its elements all have finite spectrum. If (1 - e)A(1 - e)
is not all radical, the preceding argument shows that we can find in it an
idempotent $e_{n+1}$ with $e_{n+1}Ae_{n+1}$ one-dimensional. This contradicts the maximality of
$e_1, \dots, e_n$. Hence (1 - e)A(1 - e) is all radical. But on the other hand it,
like A, is semi-simple. Hence (1 - e)A(1 - e) = 0. We next observe that
(1 - e)A is a nilpotent right ideal, and so (1 - e)A = 0 and similarly
A(1 - e) = 0. In other words, e is a unit element for A. We are now ready to
apply Lemma 6. In the light of the Gelfand-Mazur theorem that all Banach
division algebras are one-dimensional, we conclude that A is finite-dimensional.
This completes the proof of Lemma 7.


3. A remark on ideals. Our program is to study mappings which are 
isomorphisms purely in the ring-theoretic sense. It is therefore important for 
us to know that certain ideals (such as primitive ideals) which are defined with 
no reference to scalars, are automatically algebra ideals when they occur in an 
algebra. Actually, for later purposes we wish to consider more generally the 
admissibility of the ideal under general operators, where by an operator on a 
ring we mean an additive endomorphism that commutes with all left and right 
multiplications. The principal fact is given in the following lemma. 


LEMMA 8. Let A be any ring and M a regular maximal right ideal in A. Then
M automatically admits any operator on A.


Proof. Let the operator be denoted by $\theta$ and placed on the left. Then for
$x \in M$ we have to prove $\theta x \in M$. Let e be a left unit modulo M, so that
$ey - y \in M$ for all y in A. If $\theta x$ is not in M, then the right ideal generated by
$\theta x$ and M must be all of A. In particular we have

$e = \theta x(a + n) + m$

where $a \in A$, $m \in M$, and n is an integer. On right multiplying by e and
commuting $\theta$ past x we get

$e^2 = x(\theta ae + \theta ne) + me \in M.$

Now $e^2 - e \in M$; this tells us that e is in M, whence M is all of A, a
contradiction.

From Lemma 8 we deduce that any primitive ideal is automatically
operator-admissible. More generally, if J is any two-sided ideal such that A/J is
semi-simple then J is admissible, for in that case J is an intersection of regular
maximal right ideals.


4. The centroid. By the centroid³ of a ring A we mean the set of all operators
on A where, as above, an operator is an additive endomorphism commuting 
with all left and right multiplications. If A has a unit element, the centroid is 
easily seen to coincide with the ordinary center of A. 

Suppose that A is an algebra over a field F. Then the elements of F form part 
of the centroid. We call A central if the elements of F in this way form all of 
the centroid. 


LEMMA 9. Any primitive Banach algebra is central. 


³This term (used by Artin in a Princeton seminar) seems better than earlier ones that have
been used, such as “multiplication centralizer” in (3). 













Proof. Let M be a regular maximal right ideal such that A is faithfully
represented by right multiplication on A/M. We propose to invoke the theory
of the eigenring B of M, for which we refer the reader to (3, p. 236) and (6,
Lemma 3). We summarize briefly: B is defined to be the set of all a in A with
$aM \subset M$, M is a two-sided ideal in the subring B, B/M is a division ring and in
fact coincides in a natural way with the ring of all endomorphisms of A/M
commuting with all right multiplications on A/M.

Now in our case we have further that M is an algebra ideal (Lemma 8).
Also, any regular maximal right ideal in a Banach algebra is closed. It follows
further that B is closed and that B/M is a Banach division algebra which, by
the Gelfand-Mazur theorem, is simply the complex numbers.

Let an element $\theta$ of the centroid be presented. By Lemma 8, $\theta$ sends M into
itself and accordingly induces an additive endomorphism of A/M. This latter
manifestly commutes with all right multiplications by elements of A. We are
thus led to a certain element of B/M, i.e. to a complex number $\lambda$. The
information we have is that for any a in A, $\theta a - \lambda a \in M$. Then for a further
element x in A

$\theta xa - \lambda xa = x(\theta a - \lambda a) \in M.$

Thus right multiplication by $\theta a - \lambda a$ sends A into M, and induces the 0 map
on A/M. Since the representation of A on A/M is faithful, we have $\theta a - \lambda a = 0$.
The centroid of A therefore coincides with the complex numbers and A is
central.


5. The primitive case. Next we need a lemma which is concerned with the 
construction of an entire function with desired properties. 


LEMMA 10. Let $\{\lambda_i\}$ be a sequence of distinct non-zero complex numbers. For
each i let there be given a discontinuous automorphism $\sigma_i$ of the complex numbers
(the $\sigma$'s need not be distinct). Then there exists an entire function f, vanishing at
0, such that the set $\{\sigma_i[f(\lambda_i)]\}$ is unbounded.


Proof. The function f will be constructed as a sum $\sum g_n$ of polynomials.
Assuming that $g_1, \dots, g_{n-1}$ have been selected, we take

$g_n = cz(z - \lambda_1) \cdots (z - \lambda_{n-1}).$
The coefficient c is to be chosen so that
(1) $|g_n(z)| < 2^{-n}$ for $|z| < n$,
(2) $|\sigma_n[g_1(\lambda_n) + \dots + g_n(\lambda_n)]| > n$.

Of course (1) merely requires that c be suitably small. This having been
arranged, we can achieve (2), for it is known that a discontinuous automorphism
$\sigma_n$ is unbounded on any open subset of the complex plane.

By (1), the sum $\sum g_n$ converges uniformly on any bounded set. Hence the
sum f is an entire function. Since $g_i(\lambda_n) = 0$ for i > n, the terms from the
(n + 1)-st on do not disturb (2) and we have $|\sigma_n[f(\lambda_n)]| > n$. Finally f(0) = 0
since each $g_n$ vanishes at 0.


We can now dispose of the primitive case. 


LEMMA 11. Let A and B be infinite-dimensional primitive Banach algebras.
Then any ring isomorphism from A onto B is automatically real-linear (and
hence either linear or conjugate linear relative to complex scalars).


Proof. The given isomorphism induces an isomorphism between the centroids
of A and B (this is a general fact about isomorphisms between rings).
By Lemma 9 the centroid of both A and B is just the complex numbers. So the
isomorphism between the two centroids is describable as an automorphism $\sigma$
of the complex numbers. Our problem is to prove that $\sigma$ is continuous, and we
suppose the contrary. By Lemma 7, A possesses an element x with an infinite
spectrum. Let $\lambda_1, \lambda_2, \dots$ be distinct non-zero numbers in the spectrum of x.
Apply Lemma 10 (with all the $\sigma_i$'s equal to $\sigma$). If f is the resulting entire
function, then f may be applied to x to yield a well defined element y in A, and the
spectrum of y contains $f(\lambda_1), f(\lambda_2), \dots$. If we write y' for the image of y under
the isomorphism, the spectrum of y' will contain the numbers $\sigma[f(\lambda_i)]$. But
these numbers are unbounded, whereas in any Banach algebra the spectrum of
every element is bounded. This contradiction shows that $\sigma$ must be continuous,
and completes the proof of Lemma 11.


6. The main theorem. The next step in the discussion is to show that
discontinuous automorphisms of the complex numbers can arise at only a finite
number of primitive ideals. We first need a lemma somewhat analogous to
Lemma 2.


LEMMA 12. Let A be a Banach algebra, $\{M_i\}$ an infinite set of regular maximal
two-sided ideals in A. Then we can find an element x in A and distinct non-zero
complex numbers $\lambda_i$ such that $\lambda_i$ is in the spectrum of⁴ $x(M_i)$.


Proof. By the Chinese remainder theorem there exists an element $y_i$ with
$y_i(M_i) = 1$, $y_i(M_j) = 0$ for j < i. We proceed to choose numbers $\alpha_i$ and $\lambda_i$ in
succession. Having selected them up to i - 1, we take $\alpha_i$ satisfying

$0 < \alpha_i < 2^{-i}/\|y_i\|,$
and such that $(\alpha_1y_1 + \dots + \alpha_iy_i)(M_i)$ has $\lambda_i \ne 0$ in its spectrum, where $\lambda_i$ is
any number different from $\lambda_1, \dots, \lambda_{i-1}$. We define $x = \sum \alpha_iy_i$. Since the terms
from $y_{i+1}$ on map into 0 mod $M_i$, x has the desired property that $x(M_i)$ has $\lambda_i$
in its spectrum.

Before stating the next lemma, we consider the following situation: $\phi$ is a
ring isomorphism of a Banach algebra A onto a second one B, P is a primitive
ideal in A, and $Q = \phi(P)$. Then (see the remark after Lemma 8) P is an algebra
ideal in A. Moreover it is known that P is closed; indeed P is an intersection of
regular maximal right ideals and the latter are closed. Thus A/P is a primitive
Banach algebra. The same is true of B/Q, and we observe that $\phi$ induces a ring
isomorphism of A/P onto B/Q.


⁴The notation x(M) denotes the image of x in the natural homomorphism from A onto A/M.


LEMMA 13. Let A be a Banach algebra and $\phi$ a ring isomorphism of A onto a
second Banach algebra B. Then there can exist only a finite number of primitive
ideals P in A such that the isomorphism induced by $\phi$ on A/P is not real-linear.


Proof. Suppose on the contrary that there are an infinite number of such
primitive ideals and denote them by $P_i$. Then by Lemma 11 each $A/P_i$ is
finite-dimensional (and hence a total matrix algebra). In particular $P_i$ is
regular maximal. We now apply Lemma 12 and produce an element x in A and
distinct non-zero numbers $\lambda_i$ such that, for all i, $x(P_i)$ has $\lambda_i$ in its spectrum.
Write $\sigma_i$ for the discontinuous automorphism of the complex numbers
associated with the isomorphism of $A/P_i$. We apply Lemma 10, and write y = f(x)
with f the entire function given there. Then $y(P_i)$ has $f(\lambda_i)$ in its spectrum.
Passing to the algebra B, we observe that the image of $\phi(y)$ in $B/\phi(P_i)$ has
$\sigma_i[f(\lambda_i)]$ in its spectrum. Then further all these numbers $\sigma_i[f(\lambda_i)]$ lie in the
spectrum of $\phi(y)$ itself. This contradicts the boundedness of the spectrum of
$\phi(y)$.


We shall now state and prove the main theorem of the paper. 


THEOREM. Let A and B be semi-simple Banach algebras and let $\phi$ be a ring
isomorphism of A onto B. Then we may write $A = A_1 \oplus A_2 \oplus A_3$ with $A_1$
finite-dimensional, $\phi$ linear on $A_2$, and $\phi$ conjugate linear on $A_3$.


COROLLARY. If A has no finite-dimensional ideals, then any ring isomorphism
of A onto B is automatically real-linear.


Proof. We first single out (by Lemma 13) the finite number of primitive
ideals $P_1, \dots, P_r$ in A such that the induced isomorphism on $A/P_i$ is not
real-linear. We shall prove that $P_1$ is a direct summand of A, or rather the
equivalent statement that $Q_1$ is a direct summand of B, where $Q_i = \phi(P_i)$. By
Lemma 11, each $A/P_i$ is finite-dimensional, whence $P_i$ is regular maximal. By
the Chinese remainder theorem, there exists in A an element x with $x(P_1) = 1$,
$x(P_2) = \dots = x(P_r) = 0$. For any other primitive ideal P we have
$\|x(P)\| \le \|x\|$, and a fortiori the spectrum of x(P) is bounded by $\|x\|$. Write $\sigma$ for the
discontinuous automorphism of the complex numbers attached to the
isomorphism on $A/P_1$. There exists a complex number $\lambda$ such that $|\sigma(\lambda)| > 2|\lambda|\,\|x\|$.
Write $y = \phi(\lambda x)/\sigma(\lambda)$. Then $y(Q_1) = 1$, $y(Q_2) = \dots = y(Q_r) = 0$. Let
$Q = \phi(P)$ be any other primitive ideal in B. Since the induced isomorphism
from A/P onto B/Q is real-linear, it preserves the absolute value of the
spectrum of any element. Since the spectrum of x(P) is bounded by $\|x\|$ we
compute that the spectrum of y(Q) is bounded by $\|x\|\,|\lambda|/|\sigma(\lambda)| < \tfrac12$. We
apply to y the Cauchy integral, in the appropriate version for algebras that
may lack a unit element:

Here C may be taken to be a circle of radius $\tfrac12$ about 1, and the prime denotes
quasi-inverse. Then by known properties of this integral we have that $e(Q_1) = 1$,
while e(Q) = 0 for every other primitive ideal in B (including of course
$Q_2, \dots, Q_r$). Because of the semi-simplicity of B it follows that e is a central
idempotent, and indeed we have the desired direct sum decomposition
$B = Q_1 \oplus eB.$

Transferring this back by the inverse of $\phi$, we have likewise that $P_1$ is a
direct summand of A. We may of course subject $P_2, \dots, P_r$ to the same
treatment. The result is to reduce our problem to the case where $\phi$ is already
real-linear, and we accordingly make that assumption in the rest of the proof.

This last portion of the proof is purely algebraic, and is best understood by
making use of the centroid. Multiplication by i is an operator on the ring A
and thereby gives rise to a centroid element. Likewise we get an element of the
centroid of B from multiplication by i. This latter may be transferred to the
centroid of A, via the given isomorphism of A and B. We now have two
centroid elements of A, both with square equal to -1. The vital thing is to
know that they commute, for then their product will be a centroid element with
square 1. Such an element splits A into a direct sum $A_2 \oplus A_3$ of ideals, on the
first of which it is the identity, on the second the negative of the identity. This
is the decomposition we are seeking.

So it remains only to convince ourselves that the centroid of a semi-simple
ring is commutative. Here is a stronger result: if a ring A has no total left
annihilator other than 0, then its centroid is commutative. Given x in A and
$\theta_1, \theta_2$ in the centroid, we have to prove $\theta_1\theta_2x = \theta_2\theta_1x$. It is enough to prove that
this holds after a right multiplication by y. By repeated use of the fact that the
$\theta$'s commute with left and right multiplications we find:


$[(\theta_1\theta_2)x]y = (\theta_1\theta_2)(xy) = \theta_1(x \cdot \theta_2y) = \theta_1x \cdot \theta_2y,$
$[(\theta_2\theta_1)x]y = (\theta_2\theta_1)(xy) = \theta_2(\theta_1x \cdot y) = \theta_1x \cdot \theta_2y.$


With this the proof of the theorem is complete.
THE RELAXATION METHOD FOR
LINEAR INEQUALITIES 


SHMUEL AGMON 


1. Introduction. In various numerical problems one is confronted with the 
task of solving a system of linear inequalities: 


(1.1) $l_i(x) = \sum_{j=1}^{n} a_{ij}x_j + b_i \ge 0 \quad (i = 1, \dots, m),$


assuming, of course, that the above system is consistent. Sometimes one has,
in addition, to minimize a given linear form l(x). Thus, in linear programming
one obtains a problem of the latter type. To cite another example, this time 
from analysis, the problem of finding the polynomial of best approximation of 
degree less than m corresponding to a discrete function defined in N points is of 
the latter type. In this paper we shall be dealing only with the simpler problem 
of finding a solution of (1.1). Nevertheless, it is known (7, lectures IV and V) 
that the more difficult problem of minimization can be reduced to a system of 
inequalities involving no minimization by the duality (or minimax) principle. 
(However, this will increase considerably the number of unknowns and 
inequalities in the equivalent system.) 

That the numerical problem of solving a system of inequalities is in general 
no easy task could be inferred from the fact that even in the case of equations 
the numerical solution is not easy, and that many ingenious methods were 
devised (2) in the hope of obtaining at least an approximate solution in a
"reasonable" number of steps. The situation is much worse in the case of
inequalities. Of the existing methods one could mention the double description 
method (3) and the simplex method due to Dantzig (1). The elimination 
method (proposed already by Fourier) is ruled out in general due to the huge 
number of elementary operations involved. 

We propose to discuss here an iteration procedure for finding a solution of
(1.1). The idea of the algorithm involved was communicated to the author by
T. S. Motzkin.¹ This method, which uses orthogonal projection, will be seen
later to be intimately connected with the so-called relaxation method in the
case of equations (4; 5; 6), and it could be considered (after a suitable
transformation) to be the extension of this method to inequalities. Even in the case
of equations it seems to us that our results are not completely devoid of
interest, for we shall get a simple geometric proof for the convergence of the
relaxation procedure which will hold even if the (consistent) system of
equations has a singular matrix, and we shall also establish the rate of
convergence, a feature which had been absent in previous proofs (compare 6).


Received May 27, 1953. The preparation of this paper was sponsored (in part) by the Office
of the Air Comptroller, U.S.A.F.


¹It was through valuable conversations which the author had with T. S. Motzkin that he
was led to consider the problems treated here.


2. Preliminary remarks and lemmas. When considering the system (1.1)
it will be convenient to use a geometric language. Thus we shall look upon
$x = (x_1, \dots, x_n)$ as a point in n-dimensional Euclidean space, $E_n$, and each
of the inequalities (1.1) as defining a half-space. The set of solutions will
therefore consist of a convex polytope which we shall denote by $\Omega$. We shall
also say that $l_i(x) = 0$ defines an oriented hyperplane $\pi_i$; $-l_i(x) = 0$ is
oppositely oriented. A point x will be said to be on the right side of $\pi_i$ if
$l_i(x) \ge 0$, and on the wrong side of $\pi_i$ if $l_i(x) < 0$. It is clear that the set of
solutions contains all those points x which are on the right side of all oriented
hyperplanes.
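
The geometric language above translates directly into a small computation. The following Python sketch, with illustrative names that are not taken from the paper, evaluates the signed distance $l_i(x)/|a_i|$ of a point from each oriented hyperplane and lists the hyperplanes with respect to which the point is on the wrong side; the system is passed as a coefficient matrix `a` and a constant vector `b`, an assumed representation of (1.1).

```python
from math import sqrt


def signed_distances(a, b, x):
    """Signed distance l_i(x) / |a_i| of x from each oriented hyperplane l_i(x) = 0."""
    distances = []
    for row, b_i in zip(a, b):
        l_i = sum(a_ij * x_j for a_ij, x_j in zip(row, x)) + b_i
        distances.append(l_i / sqrt(sum(a_ij * a_ij for a_ij in row)))
    return distances


def wrong_side_indices(a, b, x):
    """Indices i (0-based) for which x is on the wrong side of the i-th hyperplane."""
    return [i for i, d in enumerate(signed_distances(a, b, x)) if d < 0]


# The unit square 0 <= x_1 <= 1, 0 <= x_2 <= 1 written in the form (1.1).
SQUARE_A = [[1, 0], [-1, 0], [0, 1], [0, -1]]
SQUARE_B = [0, 1, 0, 1]
```

A point is a solution of the system exactly when `wrong_side_indices` returns an empty list; for the square above, the point (2, 0.5) violates only the second inequality.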


The following simple geometric lemma is basic in our discussion: 


LEMMA 2.1. Let x and y be two points in $E_n$ separated by the oriented
hyperplane $\pi$ where x is on the wrong side of $\pi$ and y is on the right side of $\pi$. Let x'
be the orthogonal projection of x on $\pi$. Then, if $0 \le \lambda \le 2$, we have
(2.1) $|x + \lambda(x' - x) - y| \le |x - y|,$
where equality holds only for $\lambda = 0$ or $\lambda = 2$ and y on $\pi$.


Proof. Consider the two-dimensional plane T through x, x' and y. It cuts
the hyperplane $\pi$ in a line r. Clearly r separates x and y in T, and x' is the
orthogonal projection of x on r. The statement (2.1) is now obvious from the
geometric configuration.

Alternatively, to prove the lemma analytically we may assume that $\pi$ is the
hyperplane $x_1 = 0$, and that x is the point $(\xi, 0, \dots, 0)$ with $\xi < 0$, and y is
the point $(\eta_1, \dots, \eta_n)$ with $\eta_1 \ge 0$. Then x' = (0, ..., 0). Hence we have

(2.2) $|x + \lambda(x' - x) - y| = |(1 - \lambda)x - y| = \Bigl\{[(1 - \lambda)\xi - \eta_1]^2 + \sum_{i=2}^{n} \eta_i^2\Bigr\}^{1/2},$

(2.2') $|x - y| = \Bigl\{(\xi - \eta_1)^2 + \sum_{i=2}^{n} \eta_i^2\Bigr\}^{1/2}.$
The result is now obvious since $[(1 - \lambda)\xi - \eta_1]^2 \le (\xi - \eta_1)^2$, and equality
holds only if $\lambda = 0$ or $\lambda = 2$ and $\eta_1 = 0$.

For future reference we note that (2.2) and (2.2') imply the following more
precise form of (2.1) for $0 \le \lambda \le 2$:
(2.3) $|x + \lambda(x' - x) - y|^2 \le |x - y|^2 - [1 - (1 - \lambda)^2]\,|x' - x|^2.$
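
The inequality (2.3) can be tried out numerically. The Python sketch below is an illustration with hypothetical function names: it forms the relaxed point $x + \lambda(x' - x)$ for a hyperplane written as a·x + b = 0 (an assumed representation) and returns the amount by which the right member of (2.3) exceeds the left member.

```python
def dot(u, v):
    return sum(ui * vi for ui, vi in zip(u, v))


def squared_distance(u, v):
    return sum((ui - vi) ** 2 for ui, vi in zip(u, v))


def relaxed_step(a, b, x, lam):
    """x + lam*(x' - x), x' being the orthogonal projection of x on a.x + b = 0."""
    t = (dot(a, x) + b) / dot(a, a)       # x' = x - t*a
    return [xi - lam * t * ai for xi, ai in zip(x, a)]


def gap_in_2_3(a, b, x, y, lam):
    """Right member minus left member of (2.3).

    Non-negative (up to rounding) when a.x + b < 0 <= a.y + b and 0 <= lam <= 2.
    """
    moved = relaxed_step(a, b, x, lam)
    projection = relaxed_step(a, b, x, 1.0)
    return (squared_distance(x, y)
            - (1 - (1 - lam) ** 2) * squared_distance(projection, x)
            - squared_distance(moved, y))
```

With the hyperplane $x_1 = 0$, x = (-1, 0), y = (2, 1) and $\lambda = 1$, the relaxed point is the origin and the gap is 10 - 1 - 5 = 4; the surplus comes from the cross term $-2\lambda\xi\eta_1 \ge 0$ that (2.3) discards.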

LEMMA 2.2. Let $\Omega$ be a polytope defined by the inequalities (1.1) none of which
is superfluous. Let x be a point exterior to $\Omega$ and let y be the nearest point to x on
$\partial\Omega$. (The boundary of $\Omega$, denoted by $\partial\Omega$, consists of those points of $\Omega$ which lie on at
least one of the hyperplanes $\pi_i$.) Let $i_k$ (k = 1, ..., s) be the subset of indices for
which $l_{i_k}(y) = 0$, and let $\Omega_y$ be the polyhedral cone defined by
(2.4) $l_{i_k}(x) \ge 0 \quad (k = 1, \dots, s).$
Then x is exterior to $\Omega_y$ and y is also the nearest point to x on $\partial\Omega_y$.


²In what follows we do not distinguish between the point x and the vector joining the origin
to this point, and whose magnitude we denote by |x|.


Proof. Let us assume, on the contrary, that x is not exterior to $\Omega_y$ and
consequently is on the right side of all oriented hyperplanes $\pi_{i_k}$. It follows from
this and from the fact that y is on the right side of all hyperplanes $\pi_i$ (i = 1, ..., m),
that any $\pi_t$ ($t \ne i_k$) having x on its wrong side intersects the open interval xy
at a point $y_t$. At least one such hyperplane exists since x is exterior to $\Omega$. Let y*
be the nearest $y_t$ to y. Then it is easy to see that $y^* \in \partial\Omega$. But this leads to the
contradiction: |x - y*| < |x - y|, which establishes the first part of our
contention.

Let now $\bar y$ be any point on $\partial\Omega_y$ different from y. Obviously the whole
segment $\bar y y$ is contained in $\partial\Omega_y$. Also, there exists a spherical neighborhood $N_y$ in
$E_n$ around y, such that its points are on the right side of all hyperplanes $\pi_i$
with $i \ne i_k$. Thus, the segment which is the intersection of $N_y$ and the
segment $\bar y y$ is contained in $\partial\Omega$. In particular³ there exists a point $y' \in \partial\Omega$ which is
between $\bar y$ and y on the segment $\bar y y$. We therefore have:
$|x - y'| \le \alpha_1|x - y| + \alpha_2|x - \bar y|$ with $\alpha_1 + \alpha_2 = 1$, $\alpha_1 > 0$, $\alpha_2 > 0$.
But since $|x - y'| \ge |x - y|$
(y being the nearest point to x on $\partial\Omega$) we conclude that $|x - \bar y| \ge |x - y|$.
This proves the second part of the lemma.


LEMMA 2.3. Let a polyhedral cone $C$ be given by:

(2.5) $l_i(x) \equiv \sum_{j=1}^{n} a_{ij}x_j \ge 0$ $(i = 1, \ldots, m)$.

Let $E$ be the set of points $x$ such that:
(a) $x$ is exterior to $C$.
(b) The nearest point to $x$ on $\partial C$ is the origin.
Let us denote by $i'(x)$ the subset of indices for which $l_i(x) < 0$, and by $d_i(x)$ the distance of $x$ from the hyperplane $\pi_i : l_i(x) = 0$. Then

(2.6) $\inf_{x \in E} \max_{i'(x)} \dfrac{d_i(x)}{|x|} = \lambda(C) > 0$.

Proof. From the homogeneity of (2.5) it follows that there is no loss of generality in replacing $E$ by the subset $E^*$ consisting of those points of $E$ which are also on the unit hypersphere: $|x| = 1$. The set $E^*$ is clearly compact. But, for any $x \in E^*$, we have

$\max_{i'(x)} \dfrac{d_i(x)}{|x|} = \max_{i'(x)} d_i(x) > 0$,

since $x$ is on the wrong side of at least one hyperplane. This and the compactness give (2.6).

3. The method of orthogonal projection. We shall discuss here a special iteration procedure to solve (1.1) which we shall call the method of orthogonal projection, and where the sequence of iterates is defined in the following way:

$x^{(0)}$ is arbitrary;
$x^{(\nu+1)} = x^{(\nu)}$ if $x^{(\nu)}$ is a solution of (1.1);
$x^{(\nu+1)}$ is the orthogonal projection of $x^{(\nu)}$ on the farthest hyperplane $\pi_i$ with respect to which it is on the wrong side, if $x^{(\nu)}$ is not a solution of (1.1). (If this hyperplane is not unique one chooses one of the hyperplanes with respect to which $x^{(\nu)}$ is on the wrong side and whose distance from $x^{(\nu)}$ is maximum.)

Numerically, if $x^{(\nu)}$ is not a solution we consider all indices $i'$ for which $l_{i'}(x^{(\nu)}) < 0$, and among them pick an $i_0$ for which $-l_{i'}(x^{(\nu)})/|a_{i'}|$ has its greatest value; $a_i$ being the vector $(a_{i1}, \ldots, a_{in})$. Then $x^{(\nu+1)} = x^{(\nu)} + t\,a_{i_0}$, where $t = -l_{i_0}(x^{(\nu)})/|a_{i_0}|^2$.
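The numerical rule just stated can be sketched as follows. This is a minimal illustration, assuming the system (1.1) has the form $l_i(x) = \sum_j a_{ij}x_j + b_i \ge 0$ with no zero row; the function names, the tolerance `tol`, and the iteration cap are assumptions introduced here.

```python
import math

def residuals(A, b, x):
    """Return l_i(x) = sum_j a_ij * x_j + b_i for every row i."""
    return [sum(a * xj for a, xj in zip(row, x)) + bi for row, bi in zip(A, b)]

def orthogonal_projection_method(A, b, x0, tol=1e-12, max_iter=10_000):
    """Project repeatedly on the farthest violated hyperplane.

    A residual is treated as violated only when the distance to the
    hyperplane exceeds `tol` (an assumption made for floating point).
    """
    x = [float(v) for v in x0]
    for _ in range(max_iter):
        res = residuals(A, b, x)
        best_i, best_dist = None, tol
        for i, (row, r) in enumerate(zip(A, res)):
            dist = -r / math.sqrt(sum(a * a for a in row))  # -l_i(x) / |a_i|
            if dist > best_dist:
                best_i, best_dist = i, dist
        if best_i is None:          # x is (numerically) a solution
            return x
        row = A[best_i]
        t = -res[best_i] / sum(a * a for a in row)          # t = -l_i0(x) / |a_i0|^2
        x = [xj + t * a for xj, a in zip(x, row)]
    return x

# Example: x_1 >= 1, x_2 >= 1, x_1 + x_2 <= 4, starting outside the polytope.
A = [[1, 0], [0, 1], [-1, -1]]
b = [-1, -1, 4]
solution = orthogonal_projection_method(A, b, [-3, -2])
```

In this small example the first projection is on $x_1 = 1$ (distance 4) and the second on $x_2 = 1$ (distance 3), after which all residuals are non-negative.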

We shall establish now:

THEOREM 3. Let (1.1) be a consistent system of linear inequalities and let $\{x^{(\nu)}\}$ be the sequence of iterates defined above. Then $x^{(\nu)} \to x$ where $x$ is a solution of (1.1). Furthermore, if $R$ is the distance of $x^{(0)}$ from the nearest solution, we have

(3.1) $|x^{(\nu)} - x| \le 2R\theta^{\nu}$, $\nu = 0, 1, \ldots$,

where $0 < \theta < 1$ depends only on the matrix $(a_{ij})$.

Proof. We first claim that if $y$ is a solution of (1.1) then $x^{(\nu)}$ approaches $y$ steadily, or, more precisely

(3.2) $|x^{(\nu+1)} - y|^2 \le |x^{(\nu)} - y|^2 - |x^{(\nu+1)} - x^{(\nu)}|^2$.

Indeed, (3.2) is trivial if $x^{(\nu)}$ is a solution. If $x^{(\nu)}$ is not a solution then (3.2) follows from the refinement (2.3) of Lemma 2.1 with $x = x^{(\nu)}$, $x' = x^{(\nu+1)}$ and $\lambda = 1$. (We use also the fact that $y$ is a solution and hence is on the right side of $\pi_{i_0}$.)


Let us consider now the polyhedral cone $C_{i_1, \ldots, i_s}$ defined by

(3.3) $\sum_{j=1}^{n} a_{i_k j}x_j \ge 0$ $(k = 1, \ldots, s)$,

where $i_k$ is a subset of the set $i = 1, \ldots, m$, $(a_{i_k j})$ being a submatrix of the matrix $(a_{ij})$ of (1.1). Let $\lambda_{i_1, \ldots, i_s}$ be the "norm" associated with $C_{i_1, \ldots, i_s}$ which was introduced in (2.6). Then, by Lemma 2.3, $\lambda_{i_1, \ldots, i_s} > 0$. Therefore:

(3.4) $\mu = \min_{i_1, \ldots, i_s} \lambda_{i_1, \ldots, i_s} > 0$,

where the minimum is taken over all possible choices of $i_1, \ldots, i_s$ from $i = 1, \ldots, m$. Let

$\theta = \sqrt{1 - \mu^2}$,

and let us denote by $y^{(\nu)}$ the nearest point to $x^{(\nu)}$ on $\partial\Omega$. We assert that

(3.5) $|x^{(\nu+1)} - y^{(\nu)}| \le \theta\,|x^{(\nu)} - y^{(\nu)}|$ $(\nu = 0, 1, \ldots)$.

Indeed, let $\Omega_{y^{(\nu)}}$ be the polyhedral cone of Lemma 2.2 generated by the oriented hyperplanes $\pi_i$ containing $y^{(\nu)}$. From the lemma it follows that $x^{(\nu)}$ is exterior to $\Omega_{y^{(\nu)}}$, and that $y^{(\nu)}$ is also the nearest point to $x^{(\nu)}$ on $\partial\Omega_{y^{(\nu)}}$. Translating $y^{(\nu)}$ to the origin, and applying Lemma 2.3, taking into account (3.4) and the fact that $|x^{(\nu+1)} - x^{(\nu)}|$ is the distance of $x^{(\nu)}$ from the farthest hyperplane with respect to which it is on the wrong side, we find that

(3.6) $|x^{(\nu+1)} - x^{(\nu)}| \ge \max_{i'(x^{(\nu)})} \mathrm{dist}\,(x^{(\nu)}, \pi_i) \ge \lambda'\,|x^{(\nu)} - y^{(\nu)}| \ge \mu\,|x^{(\nu)} - y^{(\nu)}|$,

where $i'(x)$ has the meaning of Lemma 2.3 and $\lambda'$ is the associated "norm" of (2.6), the "primes" indicating that we are dealing with the polyhedral cone $\Omega_{y^{(\nu)}}$. Combining now (3.6) and (3.2) we get

(3.7) $|x^{(\nu+1)} - y^{(\nu)}|^2 \le |x^{(\nu)} - y^{(\nu)}|^2 - \mu^2\,|x^{(\nu)} - y^{(\nu)}|^2 = \theta^2\,|x^{(\nu)} - y^{(\nu)}|^2$,

which establishes (3.5).

The remainder of the proof now follows easily. We first note that the iterates $x^{(0)}, x^{(1)}, \ldots, x^{(\nu)}, \ldots$ are all included in the hypersphere $S^{(0)} : |x - y^{(0)}| \le |x^{(0)} - y^{(0)}| = R$. This follows from (3.2). For the same reason $x^{(1)}, \ldots, x^{(\nu)}, \ldots$ lie in the hypersphere $S^{(1)} : |x - y^{(1)}| \le |x^{(1)} - y^{(1)}|$. But since $y^{(1)}$ is the nearest point to $x^{(1)}$ on $\partial\Omega$, and on account of (3.5), we have

$|x^{(1)} - y^{(1)}| \le |x^{(1)} - y^{(0)}| \le \theta R$.

In the same way we get that $x^{(\nu)}, x^{(\nu+1)}, \ldots$ are contained in the hypersphere $S^{(\nu)} : |x - y^{(\nu)}| \le |x^{(\nu)} - y^{(\nu)}| \le \theta^{\nu}R$. It is now evident, since we have a sequence of hyperspheres with non-empty intersection and whose radii tend to zero, that

$\bigcap_{\nu=0}^{\infty} S^{(\nu)} = x$.

Thus we get $\lim x^{(\nu)} = x = \lim y^{(\nu)}$, which proves the convergence of $x^{(\nu)}$ to a solution of (1.1). Moreover, since both $x^{(\nu)}$ and $x$ belong to the hypersphere $S^{(\nu)}$ whose diameter does not exceed $2R\theta^{\nu}$, (3.1) follows, and the proof is complete.


In the above theorem we have established that the rapidity of convergence of the iterates to the solution is at least linear. However, the positive constant $\mu$ appearing in the definition of $\theta$ was obtained from Lemma 2.3 where we had only an existence statement. More elaborate considerations can give a lower bound to $\mu$ in terms of the matrix $(a_{ij})$. Let $C = (c_{ij})$ $(i, j = 1, \ldots, r)$ be a square matrix. We shall denote by $|C|$ the determinant of $C$, and by $\Gamma(C)$ the expression

(3.8) $\Gamma(C) = \left[\sum_{j=1}^{r}\left(\sum_{i=1}^{r} |C_{ij}|\right)^2\right]^{1/2}$,

where the $C_{ij}$'s are the cofactors of the elements $c_{ij}$. With this notation the following result may be established:

THEOREM. If in Theorem 3 the $l_i(x)$ are normalized so that

(3.9) $\sum_{j=1}^{n} a_{ij}^2 = 1$,

and if $r$ is the rank of the matrix $(a_{ij})$, then

(3.10) $\mu \ge \min \left[\sum \left|A^{i_1 \ldots i_r}_{j_1 \ldots j_r}\right|^2\right]^{1/2} \Big/ \left[\sum \Gamma^2\left(A^{i_1 \ldots i_r}_{j_1 \ldots j_r}\right)\right]^{1/2}$,

where the summations are taken over the range $1 \le j_1 < \ldots < j_r \le n$, and $i_1, \ldots, i_r$ are $r$ linearly independent rows of $(a_{ij})$ (the rows are held fixed in the brackets); $A^{i_1 \ldots i_r}_{j_1 \ldots j_r}$ is the $r \times r$ sub-matrix formed by the indicated rows and columns and $\Gamma\left(A^{i_1 \ldots i_r}_{j_1 \ldots j_r}\right)$ is the associated quantity (3.8), while the minimum is to be taken over all different combinations of the $i$'s which correspond to linearly independent rows.

We omit the somewhat lengthy proof of this theorem. We remark only that in its proof we make use of the invariance under orthogonal transformations of the numerator and denominator in (3.10), and use induction with respect to $r$.


4. More general procedures. The method discussed above admits different variants which all yield (when the system is consistent) a sequence $\{x^{(\nu)}\}$ of iterates converging to a solution of the system. The following are a few examples which may prove useful in computations. In all these cases the convergence is proved easily and the rate of convergence to the solution $y$ is found to be: $|x^{(\nu)} - y| = O(\theta^{\nu})$ for some $0 < \theta < 1$.

(i) The maximal residual method. This method differs only slightly from the method of orthogonal projection of §3. Instead of choosing $x^{(\nu+1)}$ as the orthogonal projection of $x^{(\nu)}$ on the farthest hyperplane $\pi_i$ with respect to which it is on the wrong side, we choose $x^{(\nu+1)}$ as the orthogonal projection of $x^{(\nu)}$ on that hyperplane $\pi_i$ for which the negative residual $l_i(x^{(\nu)})$ is the greatest in absolute value. The two methods coincide if the system (1.1) is normalized so that (3.9) holds.

(ii) Over and under projection with a fixed ratio. Here the iterates are defined in the following way: $x^{(\nu+1)} = x^{(\nu)}$ if $x^{(\nu)}$ is a solution of (1.1), or, if $x^{(\nu)}$ is not a solution we let

$x^{(\nu+1)} = x^{(\nu)} + (1 + \beta)(\xi^{(\nu)} - x^{(\nu)})$,

where $\xi^{(\nu)}$ is the projection of $x^{(\nu)}$ on the farthest hyperplane with respect to which it is on the wrong side, and where $0 < |\beta| < 1$. We say that we overproject, or underproject, according as $\beta > 0$ or $\beta < 0$.

In connection with the last procedure we remark that it is very plausible that overprojecting with a small positive constant $\beta$ will accelerate the convergence of $\{x^{(\nu)}\}$. Indeed, slowness in the convergence of the method of orthogonal projection as described in §3 may arise if some of the "solid angles" of the polytope $\Omega$ are very small. Overprojecting has the effect of opening the angles and this may have the effect of accelerating the convergence.


(iii) Systematic projection. In this procedure the $m$ hyperplanes $\pi_i : l_i(x) = 0$ are arranged in a periodic infinite sequence $\pi_\nu$ $(\nu = 1, 2, \ldots)$ where $\pi_\nu = \pi_i$ and $l_\nu(x) = l_i(x)$ if $\nu \equiv i \pmod{m}$. The sequence of iterates $x^{(\nu)}$ $(\nu = 0, 1, \ldots)$ is defined then in the following way: $x^{(0)}$ is arbitrary; $x^{(\nu+1)} = x^{(\nu)}$ if $x^{(\nu)}$ is on the right side of $\pi_{\nu+1}$ (i.e., if $l_{\nu+1}(x^{(\nu)}) \ge 0$), while if $x^{(\nu)}$ is on the wrong side of $\pi_{\nu+1}$, then $x^{(\nu+1)}$ is the orthogonal projection of $x^{(\nu)}$ on $\pi_{\nu+1}$.
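Variant (iii) needs no search for the farthest hyperplane, which makes it the simplest of the three to program. The sketch below is an illustration under the same assumed form $l_i(x) = \sum_j a_{ij}x_j + b_i \ge 0$ of the system; the function name, the tolerance, and the stopping rule (one full sweep without a projection) are choices made here and are not taken from the paper.

```python
def systematic_projection(A, b, x0, sweeps=1000, tol=1e-12):
    """Cycle through the hyperplanes in a fixed periodic order.

    Whenever the current point is on the wrong side of the hyperplane
    whose turn it is, replace it by its orthogonal projection on that
    hyperplane; otherwise leave it unchanged.
    """
    x = [float(v) for v in x0]
    for _ in range(sweeps):
        moved = False
        for row, bi in zip(A, b):
            r = sum(a * xj for a, xj in zip(row, x)) + bi   # residual l_i(x)
            if r < -tol:
                t = -r / sum(a * a for a in row)
                x = [xj + t * a for xj, a in zip(x, row)]
                moved = True
        if not moved:           # a full period without a projection
            return x
    return x

# Same small system as before: x_1 >= 1, x_2 >= 1, x_1 + x_2 <= 4.
A_sys = [[1, 0], [0, 1], [-1, -1]]
b_sys = [-1, -1, 4]
x_sys = systematic_projection(A_sys, b_sys, [-3, -2])
```

Overprojection as in (ii) could be added by multiplying `t` by $1 + \beta$; that combination is not discussed in the text and is mentioned here only as a possible experiment.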


5. The equivalence of the (generalized) relaxation method, and the method 
of orthogonal projection. The method of projections described in the last two 
sections can of course be applied to equations by replacing each equality by a 
pair of inequalities. An equivalent procedure would be to change slightly the 
algorithm defining the points $x^{(\nu)}$ by considering the absolute value of all
residuals and not only the negative ones. We shall now describe a procedure 
which will be the generalization of the relaxation method to inequalities, a 
procedure which we assume the reader to be familiar with in the case of equa- 
tions. Let 


(5.1) $L_i(y) \equiv \sum_{j=1}^{m} g_{ij}y_j + b_i \ge 0$ $(i = 1, \ldots, m)$

be a set of $m$ linear inequalities in $m$ unknowns having a symmetric and positive semi-definite matrix $G = (g_{ij})$. Clearly, one may assume that no row of $G$ is identically zero. This, together with the previous assumptions, will imply that $g_{ii} > 0$. Let $e_i$ $(i = 1, \ldots, m)$ be the $m$ unit vectors directed along the axes in the $E_m$ space, and let us define the sequence $\{y^{(\nu)}\}$ by the following iteration scheme:

$y^{(0)}$ is arbitrary;
(5.2) $y^{(\nu+1)} = y^{(\nu)}$ if $y^{(\nu)}$ is a solution of (5.1);
$y^{(\nu+1)} = y^{(\nu)} + t_\nu e_{i_\nu}$ if $y^{(\nu)}$ is not a solution of (5.1),

where $i_\nu$ is such that

$L_{i_\nu}(y^{(\nu)}) < 0$, $-L_{i_\nu}(y^{(\nu)}) = \max_i\,(-L_i(y^{(\nu)}))$,

and $t_\nu$ is a scalar chosen so that

$L_{i_\nu}(y^{(\nu+1)}) = 0$,

or, more explicitly,

(5.3) $t_\nu = -L_{i_\nu}(y^{(\nu)})/g_{i_\nu i_\nu}$.
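The scheme (5.2)-(5.3) changes a single coordinate per step. A minimal sketch follows; the function name, the tolerance `tol`, and the iteration cap are assumptions made for this illustration, and $G$ is assumed symmetric positive semi-definite with positive diagonal, as in the text.

```python
def relaxation_method(G, b, y0, tol=1e-12, max_iter=10_000):
    """Relax the most negative residual of G y + b >= 0, one coordinate at a time."""
    m = len(G)
    y = [float(v) for v in y0]
    for _ in range(max_iter):
        L = [sum(G[i][j] * y[j] for j in range(m)) + b[i] for i in range(m)]
        i_nu = min(range(m), key=lambda i: L[i])    # index of the largest -L_i
        if L[i_nu] >= -tol:                         # y is (numerically) a solution
            return y
        y[i_nu] += -L[i_nu] / G[i_nu][i_nu]         # t_nu of (5.3); makes L_{i_nu} vanish
    return y

# Example with a positive definite G.
G_ex = [[2.0, 1.0], [1.0, 2.0]]
b_ex = [-1.0, -1.0]
y_ex = relaxation_method(G_ex, b_ex, [0.0, 0.0])
```

Starting from the origin, the first step raises $y_1$ to $1/2$ and the second raises $y_2$ to $1/4$, after which both residuals are non-negative.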


The above procedure can be considered as the extension of the relaxation method to inequalities. We shall now establish the following theorem:

THEOREM 5. If the previously discussed system (5.1) is consistent, then the sequence $\{y^{(\nu)}\}$ tends to a solution $y$ of the system. Moreover, we have

(5.4) $|y^{(\nu)} - y| \le K\,|y^{(0)} - y|\,\theta^{\nu}$, $\nu = 0, 1, \ldots$,

where $0 < \theta < 1$, and where the constants $K$ and $\theta$ depend only on the matrix $G$.

Proof. We shall show that the relaxation procedure can be interpreted as an orthogonal projection procedure, which will enable us to use our previous results. Since $G$ is symmetric and positive semi-definite, there exists a real matrix $A = (a_{ij})$ $(i, j = 1, \ldots, m)$ such that

$G = AA^*$,

where we denote by $A^*$ the transposed matrix. Let us introduce the new variable $x = (x_1, \ldots, x_m)$ connected with $y = (y_1, \ldots, y_m)$ by

(5.5) $x = yA$.


We shall associate with the system (5.1) (which in matrix notation can be written as $yAA^* + b \ge 0$) the system

(5.6) $l_i(x) \equiv \sum_{j=1}^{m} a_{ij}x_j + b_i \ge 0$, $i = 1, \ldots, m$,

or in matrix notation

$xA^* + b \ge 0$.

Obviously if (5.1) is consistent the same will be true for the system (5.6). Let us define the sequence $\{x^{(\nu)}\}$ by:

(5.7) $x^{(\nu)} = y^{(\nu)}A$, $\nu = 0, 1, \ldots$.

We claim that $\{x^{(\nu)}\}$ is also a sequence obtained from (5.6) by the method of orthogonal projection (i) of §4. Indeed, we note that $L_i(y^{(\nu)}) = l_i(x^{(\nu)})$ so that the residuals are the same for the two systems. Now, if $y^{(\nu)}$ is a solution of (5.1) then $x^{(\nu)}$ is a solution of (5.6), and $x^{(j)} = x^{(\nu)}$, $j \ge \nu$. If $y^{(\nu)}$ is not a solution, then:

$y^{(\nu+1)} = y^{(\nu)} + t_\nu e_{i_\nu}$,

where

$-L_{i_\nu}(y^{(\nu)}) = \max_i\,(-L_i(y^{(\nu)}))$, $L_{i_\nu}(y^{(\nu+1)}) = 0$.

Obviously, we have also:

$-l_{i_\nu}(x^{(\nu)}) = \max_i\,(-l_i(x^{(\nu)}))$, $l_{i_\nu}(x^{(\nu+1)}) = 0$,

so that $x^{(\nu)}$ is replaced by the point $x^{(\nu+1)}$ situated on the hyperplane $\pi_{i_\nu}$ corresponding to the negative residual with the largest absolute value. Finally, we have

$x^{(\nu+1)} - x^{(\nu)} = t_\nu e_{i_\nu}A = t_\nu(a_{i_\nu 1}, \ldots, a_{i_\nu m})$,

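which shows that $x^{(\nu+1)}$ is indeed the orthogonal projection of $x^{(\nu)}$ on $\pi_{i_\nu}$.

But we have pointed out in §4 that the sequence $\{x^{(\nu)}\}$ converges to a solution $x$ of (5.6). More precisely, in the same manner as (3.1) was established one shows that:

(5.8) $|x^{(\nu)} - x| \le 2R\theta^{\nu}$,

where $R$ is the distance of $x^{(0)}$ from the set of solutions of (5.6), and where $0 < \theta < 1$ depends only on $A$. Assuming the non-trivial case where $y^{(\nu)}$ is not a solution, we may write

(5.9) $0 > L_{i_\nu}(y^{(\nu)}) = l_{i_\nu}(x^{(\nu)}) = l_{i_\nu}(x) + \sum_{j=1}^{m} a_{i_\nu j}(x_j^{(\nu)} - x_j)$.

But $l_{i_\nu}(x) \ge 0$ and

(5.10) $\left|\sum_{j=1}^{m} a_{i_\nu j}(x_j^{(\nu)} - x_j)\right| \le \left(\sum_{j=1}^{m} a_{i_\nu j}^2\right)^{1/2} |x^{(\nu)} - x| = g_{i_\nu i_\nu}^{1/2}\,|x^{(\nu)} - x|$,

so that if we define

(5.11) $Q = \max_i g_{ii}^{1/2}$ and $q = \min_i g_{ii}$,

we get from (5.8)-(5.11):

(5.12) $|L_{i_\nu}(y^{(\nu)})| \le 2RQ\theta^{\nu}$.

Combining (5.12), (5.3) and (5.2), we find that

(5.13) $|y^{(\nu+1)} - y^{(\nu)}| \le |L_{i_\nu}(y^{(\nu)})|/g_{i_\nu i_\nu} \le \dfrac{2Q}{q}R\theta^{\nu}$ $(\nu = 0, 1, \ldots)$,

from which follows the convergence of $y^{(\nu)}$ to a point $y$. Since the solution $x$ of (5.6) is related to $y$ by (5.5), $y$ is also a solution of (5.1). Finally, we may write:

$|y^{(\nu)} - y| \le \sum_{j=\nu}^{\infty} |y^{(j+1)} - y^{(j)}| \le \dfrac{2Q}{q}R\sum_{j=\nu}^{\infty}\theta^{j} = \dfrac{2Q}{q}R\,\dfrac{\theta^{\nu}}{1 - \theta}$.

In order to get the exact statement (5.4) we note that

$R \le |x^{(0)} - x| = \left(\sum_{j=1}^{m}\left(\sum_{i=1}^{m} a_{ij}(y_i^{(0)} - y_i)\right)^2\right)^{1/2} \le \left\{\sum_{j=1}^{m}\left[\sum_{i=1}^{m}(y_i^{(0)} - y_i)^2 \sum_{i=1}^{m} a_{ij}^2\right]\right\}^{1/2} = |y^{(0)} - y|\left(\sum_{i,j} a_{ij}^2\right)^{1/2} = |y^{(0)} - y|\left(\sum_{i=1}^{m} g_{ii}\right)^{1/2} \le |y^{(0)} - y|\,m^{1/2}Q$,

which, when combined with (5.13), gives (5.4) where

$K = \dfrac{2m^{1/2}Q^2}{q(1 - \theta)}$.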



We have discussed above one type of relaxation. Similar results hold for other types, such as relaxation with maximum change of $|y^{(\nu+1)} - y^{(\nu)}|$ (this corresponds to the method of orthogonal projections of §3), over and under relaxation and the method of systematic relaxation. It is also obvious that the results hold true for the case of equations after we make the necessary change in the algorithm defining $\{y^{(\nu)}\}$.

We shall now proceed to show that conversely the orthogonal projection method can be interpreted as a relaxation procedure, at least when the initial point $x^{(0)}$ is suitably chosen. We shall suppose that a consistent system of inequalities (1.1) is given. Or, in matrix notation,

$xA^* + b \ge 0$,

where $A$ is an $m \times n$ matrix, and $x = (x_1, \ldots, x_n)$ a row vector. Let us introduce the new variable $y = (y_1, \ldots, y_m)$ which will again be connected with $x$ by

$x = yA$.

Let us also consider the associated system

(5.14) $yAA^* + b \ge 0$,

which, when expanded, has the form of (5.1), where $G = AA^*$ is an $m \times m$ symmetric and positive semi-definite matrix. Let now $x^{(0)}$ be a starting point of the form

(5.15) $x^{(0)} = y^{(0)}A$,

and let us define $x^{(1)}, \ldots, x^{(\nu)}, \ldots$ by the orthogonal projection method (i) of §4. That is, $x^{(\nu+1)}$ is the orthogonal projection of $x^{(\nu)}$ on the hyperplane $\pi_{i_\nu} : l_{i_\nu}(x) = 0$ corresponding to the negative residual $l_i(x^{(\nu)})$ with the largest absolute value. We shall also define a corresponding sequence $y^{(\nu)}$ in $E_m$ as follows: $y^{(0)}$ is the chosen starting point in (5.15), $y^{(\nu+1)}$ is the projection of $y^{(\nu)}$ on the hyperplane $L_{i_\nu}(y) = 0$ in the direction parallel to the $y_{i_\nu}$ axis ($i_\nu$ being the sequence of indices associated with the $x^{(\nu)}$'s). It is now easy to see (using (5.2)) that

$x^{(\nu)} = y^{(\nu)}A$,

and that, moreover, the sequence $\{y^{(\nu)}\}$ may also be obtained by the relaxation scheme (5.2). Now, since in the proof of Theorem 5 we did not use the consistency of the system (5.1), but only that of (5.6), we may use the same proof to obtain again that $y^{(\nu)}$ converges to a solution $y$, and that (5.4) holds. Thus, we have established the equivalence of the two methods, and have also obtained, as a by-product, that the two systems (1.1) and (5.14) are either both consistent or both inconsistent.
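The correspondence $x^{(\nu)} = y^{(\nu)}A$ can be observed numerically by running both iterations side by side with the same index sequence $i_\nu$. The sketch below assumes the form $xA^* + b \ge 0$ for (1.1); all function names, the tolerance, and the sample data are assumptions introduced for this illustration.

```python
def gram(A):
    """G = A A*, the m x m matrix of inner products of the rows of A."""
    return [[sum(p * q for p, q in zip(ri, rk)) for rk in A] for ri in A]

def times_A(y, A):
    """Row vector times matrix: x = y A."""
    return [sum(y[i] * A[i][j] for i in range(len(A))) for j in range(len(A[0]))]

def max_gap(A, b, y0, steps=50, tol=1e-12):
    """Largest observed |x^(nu) - y^(nu) A| (max norm) over the run."""
    G = gram(A)
    m = len(A)
    y = [float(v) for v in y0]
    x = times_A(y, A)                       # starting point of the form (5.15)
    gap = 0.0
    for _ in range(steps):
        res = [sum(a * xj for a, xj in zip(A[i], x)) + b[i] for i in range(m)]
        i_nu = min(range(m), key=lambda i: res[i])   # most negative residual
        if res[i_nu] >= -tol:
            break
        # Projection step in x-space on the hyperplane l_{i_nu}(x) = 0.
        t = -res[i_nu] / sum(a * a for a in A[i_nu])
        x = [xj + t * a for xj, a in zip(x, A[i_nu])]
        # Relaxation step (5.3) in y-space, using the same index.
        L = sum(G[i_nu][j] * y[j] for j in range(m)) + b[i_nu]
        y[i_nu] += -L / G[i_nu][i_nu]
        gap = max(gap, max(abs(u - v) for u, v in zip(x, times_A(y, A))))
    return gap

A_eq = [[1.0, 0.0], [1.0, 1.0], [-1.0, 2.0]]
b_eq = [-1.0, -3.0, 0.0]
observed_gap = max_gap(A_eq, b_eq, [1.0, -2.0, 0.5])
```

The index is taken from the $x$-sequence and reused for the $y$-sequence, as in the text, so that rounding cannot make the two iterations choose different hyperplanes when residuals tie.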








392 SHMUEL AGMON 


REFERENCES 


1. G. B. Dantzig, Maximization of a linear form whose variables are subject to a system of linear 
inequalities (U.S.A.F., 1949), 16 pp. 

2. G. E. Forsythe, Solving linear algebraic equations can be interesting, Bull. Amer. Math. Soc., 
59 (1953), 299-329. 

3. T. S. Motzkin, H. Raiffa, G. L. Thompson, and R. M. Thrall, The double description method,
in Contributions to the Theory of Games, Annals of Mathematics Series, 2 (1953), 51-74.

4. R. V. Southwell, Relaxation methods in engineering science (Oxford, 1940). 

5. R. V. Southwell, Relaxation methods in theoretical physics (Oxford, 1946).

6. G. Temple, The general theory of relaxations applied to linear systems, Proc. Roy. Soc. London, 
169 (1939), 476-500. 

7. Linear programming seminar notes, Institute for Numerical Analysis (Los Angeles, 1950). 





The Rice Institute, Houston, Texas 
University of California at Los Angeles
National Bureau of Standards at Los Angeles 
The Hebrew University, Jerusalem 
















THE RELAXATION METHOD FOR 
LINEAR INEQUALITIES 


T. S. MOTZKIN AND I. J. SCHOENBERG


I. STATEMENT OF PROBLEM AND MAIN RESULTS 


1. The relaxation method. Let $A$ be a closed set of points in the $n$-dimensional euclidean space $E_n$. If $p$ and $p_1$ are points of $E_n$ such that

(1.1) $|p - a| > |p_1 - a|$, for every $a \in A$,

then $p_1$ is said to be point-wise closer than $p$ to the set $A$. If $p$ is such that there is no point $p_1$ which is point-wise closer than $p$ to $A$, then $p$ is called a closest point to the set $A$. In 1922 Fejér (2) made the interesting observation that the set of closest points to $A$ is identical with the convex hull $K(A)$ of the set $A$. We have mentioned this remark because it will suggest a way of dealing with our main problem, to which we now turn.

We are given a consistent system of $m$ linear inequalities

(1.2) $\sum_{j=1}^{n} a_{ij}x_j + b_i \ge 0$ $(i = 1, \ldots, m)$.

The coefficients $a_{ij}$ and $b_i$ being given numerically, the problem is to devise a numerical procedure which will furnish a solution $(x_1, \ldots, x_n)$ of the system (1.2). In the case of a homogeneous system, i.e. when all $b_i = 0$, we add the obvious requirement that the solution $(x_1, \ldots, x_n)$ obtained be different from the trivial solution $(0, \ldots, 0)$.


A natural approach to this problem will be suggested by Fejér's idea as soon as we place the problem in its customary geometric setting. Each of the inequalities (1.2) defines a closed half-space

(1.3) $H_i : \sum_{j=1}^{n} a_{ij}x_j + b_i \ge 0$,

in terms of which the set of points corresponding to the solutions of (1.2) is identical with the convex polytope

(1.4) $A = \bigcap_{i=1}^{m} H_i$,

which is assumed from the outset not to be void. Let $p \notin A$ be given. The following simple construction furnishes a point $p_1$ which is point-wise closer than $p$ to $A$: Clearly $p \notin H_j$ for some $j$. Let $p'$ be symmetric to $p$ with respect to the boundary $\pi_j$ of $H_j$. If $p_1$ is on the segment joining $p$ to $p'$ and $p_1 \ne p, p'$, then clearly if $a \in H_j$ then

$|p - a| > |p_1 - a|$

holds. As $A \subset H_j$, we see that (1.1) is verified, i.e. $p_1$ is point-wise closer than $p$ to $A$. The numerical "construction" of $p_1$ is easily done as follows. Let $q$ be the projection of $p$ on the hyperplane $\pi_j$, choose a number $\lambda$ such that $0 < \lambda < 2$ and set

$p_1 = p + \lambda(q - p)$.

In passing from $p$ to $p_1$, the point-wise approach to $A$ would seem to be strongest if among the $H_j$ not containing $p$ we select the one which is furthest away from $p$. If $\lambda = 2$ then $p_1 = p'$ and then (1.1) again holds, with the exception that we have the equality sign for the points of $A$ which are on the boundary of $H_j$, if such points exist.

These remarks suggest the following systematic search for a point of $A$: Choose a point $p$ at will. If $p \in A$, i.e. its coordinates satisfy (1.2), our quest has ended. If $p \notin A$, let $H_j$ be such that

(1.5) $\mathrm{dist}\,(p, H_j) = \max_i \mathrm{dist}\,(p, H_i)$,

where dist denotes euclidean distance. Let $q$ be such that

(1.6) $q \in H_j$, $|p - q| = \mathrm{dist}\,(p, H_j)$.

If $\lambda$ is a constant, $0 < \lambda \le 2$, we define

(1.7) $p_1 = p + \lambda(q - p)$.

For convenience we abbreviate this construction by writing

(1.8) $p_1 = F_\lambda(p)$,

where $F_\lambda(p)$, defined for $p \notin A$, has been made single-valued by some preassigned rule for choosing $j$ in case that (1.5) should not define $j$ uniquely.¹ If $p_1 \in A$ our process has terminated. If $p_1 \notin A$ we iterate (1.8), obtaining

$p_2 = F_\lambda(p_1)$,

and we continue in like manner deriving a sequence of points $p = p_0, p_1, p_2, \ldots$ all outside $A$ and connected by the relation

(1.9) $p_{\nu+1} = F_\lambda(p_\nu)$ $(\nu = 0, 1, \ldots)$.

There are two alternatives: (1) The process terminates after $N$ steps with a point $p_N \in A$; (2) The process continues indefinitely producing an infinite sequence $\{p_\nu\}$.
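The process (1.5)-(1.9) can be sketched in a few lines. The function name, the tolerance `tol`, and the step cap below are assumptions made for this illustration; ties in (1.5) are resolved by the smallest index $j$, which is one admissible preassigned rule.

```python
import math

def relaxation_sequence(A, b, p0, lam, tol=1e-12, max_steps=1000):
    """Return [p_0, p_1, ...] produced by p_{nu+1} = F_lambda(p_nu).

    Half-spaces are H_i: sum_j a_ij x_j + b_i >= 0.  The run stops as soon
    as the current point lies in every H_i (within `tol`) or after
    `max_steps` relaxation steps.
    """
    p = [float(v) for v in p0]
    points = [p]
    for _ in range(max_steps):
        worst_j, worst_d = None, tol
        for j, (row, bj) in enumerate(zip(A, b)):
            norm = math.sqrt(sum(a * a for a in row))
            d = -(sum(a * pi for a, pi in zip(row, p)) + bj) / norm  # dist(p, H_j) if positive
            if d > worst_d:                 # strict: keeps the smallest j on ties
                worst_j, worst_d = j, d
        if worst_j is None:                 # p belongs to A: the process terminates
            break
        row = A[worst_j]
        norm = math.sqrt(sum(a * a for a in row))
        # q - p = worst_d * row / norm, so p_1 = p + lam * (q - p).
        p = [pi + lam * worst_d * a / norm for pi, a in zip(p, row)]
        points.append(p)
    return points

# Reflexion (lam = 2) into the quadrant x_1 >= 0, x_2 >= 0.
quadrant_A = [[1, 0], [0, 1]]
quadrant_b = [0, 0]
path = relaxation_sequence(quadrant_A, quadrant_b, [-1, -3], lam=2.0)
```

In this example the point $(-1, -3)$ is first reflected in $x_2 = 0$ (the further half-space) and then in $x_1 = 0$, landing at $(1, 3)$ after two steps.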


2. Statement of the main theorems. S. Agmon (1) has recently shown that if $0 < \lambda < 2$ and the sequence $\{p_\nu\}$ is infinite, then $p_\nu$ converges, as $\nu \to \infty$, to a point on the boundary of $A$. We give a new proof of this result (Theorem 1, Case 1, and Theorem 2, Case 1, below). Our main contribution, however, is the investigation of the case when $\lambda = 2$. Throughout this paper we denote by $r$ the dimensionality of the polytope $A$ defined by (1.4). As we assume that $A$ is not void, $r$ may have any value from zero to $n$. We denote by $L_r$ the $r$-flat which contains $A$.

¹ For instance the smallest $j$ satisfying (1.5).


THEOREM 1. We assume that $r = n$, i.e. $A$ is not contained in any hyperplane of $E_n$. Let $\{p_\nu\}$ be a sequence of points obtained by the process described in § 1. There are two cases:

Case 1. If $0 < \lambda < 2$ then either $\{p_\nu\}$ terminates or else $p_\nu$ converges to a point $l$ on the boundary of $A$.

Case 2. If $\lambda = 2$ then the sequence $\{p_\nu\}$ always terminates.


The formulation of our results for the case when $r < n$ requires the following remarks concerning spherical surfaces. Let $L_r$ be a given $r$-flat in $E_n$, $0 \le r \le n - 1$. We are also given a point $p$, $p \notin L_r$. Let $X$ be the locus of points $x$ such that

(1.10) $|x - a| = |p - a|$, for every $a \in L_r$.

We claim that $X$ is a spherical surface $S_{n-r-1}$ of dimension $n - r - 1$. Thus if $r = 0$, then $L_r$ reduces to a single point $a$ and $X$ is evidently the $S_{n-1}$ with center at $a$ passing through $p$. In the other extreme case when $r = n - 1$, $L_r$ is a hyperplane and the locus $X$ contains exactly two points: the point $p$ and its symmetric image with respect to $L_r$. These two points form an $S_0$ located on the line through $p$ which is normal to $L_r$. A general proof of our assertion is as follows: Let $b$ be the orthogonal projection of $p$ onto $L_r$. Erect at $b$ the $(n - r)$-flat $L'_{n-r}$ which is normal to $L_r$. Evidently $p \in L'_{n-r}$. Then $x \in X$ if and only if $x \in L'_{n-r}$ and $|x - b| = |p - b|$. Indeed, assume that $x \in X$. By (1.10), for $a = b$, we obtain that $|x - b| = |p - b|$. This last equality and (1.10) show that for every $a \in L_r$ the two triangles $xba$ and $pba$ are congruent. Since $\angle pba = 90°$ we conclude that $\angle xba = 90°$. Hence the line joining $b$ and $x$ is normal to $L_r$ and $x \in L'_{n-r}$. Conversely, if $x \in L'_{n-r}$ and $|x - b| = |p - b|$, let us show that $x \in X$. This is now clear because the two triangles $pba$ and $xba$ $(a \in L_r)$ are right-angled at $b$ and have equal legs respectively. This implies (1.10), hence $x \in X$. The locus $X$ may accordingly be defined by the two conditions

$x \in L'_{n-r}$, $|x - b| = |p - b|$,

and is therefore seen to be identical with the spherical surface $S_{n-r-1}$ of $L'_{n-r}$ having its center at $b$ and passing through $p$. We shall refer to $S_{n-r-1}$ as a spherical surface having $L_r$ as its axis, for indeed, by (1.10), $L_r$ is precisely the

locus of points $a$ with the property of being equidistant from all points of $S_{n-r-1}$.

THEOREM 2. We assume that $r < n$, $A \subset L_r$. Let $\{p_\nu\}$ be a sequence of points obtained by the process of § 1.

Case 1. If $0 < \lambda < 2$, then $\{p_\nu\}$ either terminates or else $p_\nu$ converges to a point $l$ of $A$.

Case 2. If $\lambda = 2$, then $\{p_\nu\}$ either terminates or else there is a number $\nu_0$ such that the points $p_\nu$, for $\nu \ge \nu_0$, are on a spherical surface $S_{n-r-1}$ having $L_r$ as its axis.


3. Remarks. (a) The procedure here described for finding a solution of (1.2) is called the relaxation method, especially if $\lambda = 1$, when it may also be called the projection method. We speak of under-relaxation or over-relaxation depending on whether $0 < \lambda < 1$ or $1 < \lambda < 2$. The case when $\lambda = 2$ is an extreme case of over-relaxation which may also be called the reflexion method.

(b) Theorem 1, Case 2, describes the main advantage of the reflexion method ($\lambda = 2$). No other value of $\lambda$, $0 < \lambda < 2$, has the property of always leading to a terminating sequence if $r = n$. If $0 < \lambda \le 1$, this is easily shown by consideration of a triangle $A$ in $E_2$. For a $\lambda$ with $1 < \lambda < 2$, an example of a non-terminating sequence in $E_2$ is constructed as follows (Fig. 1): Let $\angle pqO = 90°$ and $p$, $q$, $p_1$ be such that $pp_1/pq = \lambda$, hence $pq > qp_1$. Draw the ray $Or$ such that $\angle pOr = \angle qOp_1$, and produce $Or$ into a full line $r'Or$. Let $A$ be the intersection of the closed half-plane below $q'q$ and the closed half-plane above $r'r$. Starting with $p$ and iterating the process $p_1 = F_\lambda(p)$, we obtain an infinite sequence of points $\{p_\nu\}$ which oscillate between the rays $Op$ and $Op_1$, converging to $O$.

[Fig. 1]
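The construction of Fig. 1 can be reproduced numerically. In the sketch below the coordinates are an assumption made for illustration: $O$ is the origin, the line $q'q$ is the horizontal axis, $p$ lies on the unit circle at angle $\alpha$ above it, and the line $r'r$ makes the angle $\alpha - \alpha_1$ with $q'q$, where $\tan\alpha_1 = (\lambda - 1)\tan\alpha$ is the angle of $Op_1$ below $q'q$. The function name and default values are likewise hypothetical.

```python
import math

def fig1_sequence(lam=1.5, alpha=math.radians(30), steps=40):
    """Iterate p -> F_lambda(p) for the two half-planes of the example."""
    alpha1 = math.atan((lam - 1.0) * math.tan(alpha))   # angle of Op_1 below q'q
    phi = alpha - alpha1                                # inclination of the line r'r
    # Unit inner normals: below q'q is -x_2 >= 0; above r'r is n . x >= 0.
    normals = [(0.0, -1.0), (-math.sin(phi), math.cos(phi))]
    p = (math.cos(alpha), math.sin(alpha))
    points = [p]
    for _ in range(steps):
        # Distance of p from each half-plane (positive when p is outside it).
        dists = [-(n[0] * p[0] + n[1] * p[1]) for n in normals]
        j = max(range(2), key=lambda k: dists[k])
        if dists[j] <= 0.0:                             # p in A: would terminate
            break
        n = normals[j]
        p = (p[0] + lam * dists[j] * n[0], p[1] + lam * dists[j] * n[1])
        points.append(p)
    return points

pts = fig1_sequence()
angles = [math.degrees(math.atan2(y, x)) for x, y in pts]
```

With these coordinates every step multiplies the distance from $O$ by $\cos\alpha/\cos\alpha_1 < 1$, and the entries of `angles` alternate between $\alpha$ and $-\alpha_1$, so the points never enter $A$ while approaching $O$.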

(c) Let $\lambda = 2$, $r < n$ and let us suppose that the sequence $\{p_\nu\}$ is infinite. By Theorem 2, Case 2, for $\nu \ge \nu_0$, all $p_\nu$ are on a $S_{n-r-1}$ having $L_r$ as its axis. Since $p_\nu$ and $p_{\nu+1}$ are both on $S_{n-r-1}$, the hyperplane with respect to which $p_\nu$ and $p_{\nu+1}$ are symmetric to each other must contain the axis $L_r$ of $S_{n-r-1}$. We may state this result as

COROLLARY 1. Let $r < n$ and let the reflexion process ($\lambda = 2$) lead to an infinite sequence $\{p_\nu\}$. Then there exists an integer $\nu_0$ such that all the hyperplanes

(1.11) $\pi_i : \sum_{j=1}^{n} a_{ij}x_j + b_i = 0$

which are actually used in the reflexion process for $\nu \ge \nu_0$ contain the entire polytope $A$ and therefore also the $r$-flat $L_r$ which contains $A$. Any such hyperplane, or combination of such independent hyperplanes, may therefore be used to reduce the problem to one of a dimension less than $n$.


(d) When the inequalities (1.2) are all homogeneous, we wish to find a point of $A$ distinct from its vertex $O$. The relaxation method may well lead to the trivial solution $O$, if $0 < \lambda < 2$. Thus if $1 < \lambda < 2$ and if $A$ is the "cone" in $E_2$ of Fig. 1, we have the infinite sequence $\{p_\nu\}$ converging to $O$. By Theorem 1, Case 2, and Theorem 2, Case 2, this can never happen if $\lambda = 2$.

In II and III we prove Theorems 1 and 2 respectively. In IV we discuss the behavior of the reflexion process for a special kind of infinite family of half-spaces, namely all half-spaces of support of a bounded and closed convex set in $E_n$. A study of this problem, suggested by our previous discussion, seems justified by its own geometric interest.


II. A PROOF OF THEOREM 1


4. On Fejér-monotone sequences of points. Let $A$ be defined by (1.4) and let $q_0, q_1, q_2, \ldots$ be an infinite sequence of points outside $A$ with the following properties:

(2.1) $q_\nu \ne q_{\nu+1}$,

(2.2) $|q_\nu - a| \ge |q_{\nu+1} - a|$, for all $a \in A$ $(\nu = 0, 1, \ldots)$.

The sequence $\{q_\nu\}$ is approaching the set $A$ point-wise and we summarize this situation by saying that the sequence $\{q_\nu\}$ is Fejér-monotone with respect to $A$. Concerning such sequences we prove

LEMMA 1. Let the sequence $\{q_\nu\}$ be Fejér-monotone with respect to the polytope $A$, assumed to be of dimension $r$.

Case 1. If $r = n$ then the sequence $\{q_\nu\}$ converges to a point.

Case 2. If $r < n$ then the sequence $\{q_\nu\}$ either converges to a point or else the set of its limit points lies on a spherical surface $S_{n-r-1}$ whose axis is the $r$-flat $L_r$ spanned by $A$.

Proof. Case 1. Let $r = n$ and consider the spherical surfaces

(2.3) $S^{\nu}(a) : |x - a| = |q_\nu - a|$ $(a \in A,\ \nu = 0, 1, \ldots)$.

By (2.2) the surface $S^{\nu}(a)$ is non-expanding as its center $a$ is kept fixed and $\nu$ increases. Therefore the following limits exist

(2.4) $\lim_{\nu \to \infty} |q_\nu - a| = R(a)$ $(a \in A)$.

Define the surface

(2.5) $S(a) : |x - a| = R(a)$ $(a \in A)$,

and let us consider the set

(2.6) $X = \bigcap_{a \in A} S(a)$.

(a) Every limit point $l$ of the sequence $\{q_\nu\}$ is in $X$. Indeed by (2.4) we see that $l \in S(a)$, for every $a \in A$, hence $l \in X$, by (2.6). We conclude that $X$ is not void because the bounded sequence $\{q_\nu\}$ has at least one limit point.

(b) The set $X$ contains exactly one point $l$. Indeed, if $l \ne l'$, $l \in X$, $l' \in X$, then let $\pi$ denote the hyperplane of points equidistant from $l$ and $l'$. If $a \in A$ then $l$, $l'$ being both in $X$, are also both in $S(a)$. Hence $a \in \pi$ and we conclude that $A \subset \pi$ in contradiction to our assumption that $r = n$.

(c) We conclude that $q_\nu \to l$. Indeed, by (a) and (b) we see that $l = X$ is the only limit point of the sequence $\{q_\nu\}$.


CASE 2. Let $r < n$, $A \subset L_r$. If the sequence $\{q_\nu\}$ converges to a point then there is nothing to prove. Let us assume that

(2.7) the sequence $\{q_\nu\}$ does not converge to a point.

In any case the bounded sequence $\{q_\nu\}$ has limit points and let $p$ be one of them. Define as before the spheres $S^{\nu}(a)$, $S(a)$ and the set $X$ by (2.3), (2.4), (2.5) and (2.6). By (2.4) we have that $R(a) = |p - a|$ $(a \in A)$. The set $X$ is therefore identical with the set of points $x$ such that

$|x - a| = |p - a|$, for every $a \in A$.

Since $A$ spans $L_r$ we may also define $X$ as the locus of points $x$ such that

(2.8) $|x - a| = |p - a|$, for every $a \in L_r$.

As shown in § 2, (2.8) defines a $S_{n-r-1}$, provided that the locus does not reduce to a point of $L_r$. Since this locus contains all limit points of $\{q_\nu\}$, our assumption (2.7) excludes this possibility and (2.8) defines a spherical surface $S_{n-r-1}$. This completes a proof of Lemma 1.


5. Proof of Theorem 1, Case 1. Here $r = n$; assume the sequence $\{p_\nu\}$ to be infinite. As already mentioned in § 1, the sequence $\{p_\nu\}$ is Fejér-monotone with respect to $A$. By Lemma 1, Case 1, the sequence $\{p_\nu\}$ converges to a point $l$:

(2.9) $\lim p_\nu = l$.

We have to show that $l \in A$. For this purpose we introduce the function

(2.10) $d(x) = \max_i \mathrm{dist}\,(x, H_i)$.

Observe that $d(x)$ is everywhere continuous and that

(2.11) $d(x) = 0$ if $x \in A$, $d(x) > 0$ if $x \notin A$.

Now (2.9) implies that $|p_{\nu+1} - p_\nu| \to 0$ and therefore also that

$d(p_\nu) = \dfrac{1}{\lambda}\,|p_{\nu+1} - p_\nu| \to 0$.

By the continuity of $d(x)$ we have

$d(l) = \lim d(p_\nu) = 0$

and therefore $l \in A$, by (2.11). The point $l$, being a limit of exterior points $p_\nu$, must be on the boundary of $A$.


6. Proof of Theorem 1, Case 2. Let $\lambda = 2$, $r = n$, and let us show that the sequence $\{p_\nu\}$ must terminate. Indeed, suppose it were infinite. By Lemma 1, Case 1, we again conclude that

(2.12) $\lim p_\nu = l$

and the argument used in the previous paragraph shows that $l$ is on the boundary of $A$. Let $p_{\nu+1}$ be obtained from $p_\nu$ by reflexion in the boundary $\pi_{j_\nu}$ of the half-space $H_{j_\nu}$. By (2.12) it is clear that

(2.13) $\lim_{\nu \to \infty} \mathrm{dist}\,(l, \pi_{j_\nu}) = 0$.

The given family of hyperplanes (1.11) being finite, we conclude from (2.13) that

$\mathrm{dist}\,(l, \pi_{j_\nu}) = 0$,

provided $\nu \ge \nu_0$, hence

$l \in \pi_{j_\nu}$, $\nu \ge \nu_0$.

This conclusion, however, contradicts (2.12). Indeed, every point $p_\nu$ $(\nu > \nu_0)$ is obtained from the preceding point $p_{\nu-1}$ by reflexion in a hyperplane through $l$. All these points must therefore lie on the spherical surface

$|x - l| = |p_{\nu_0} - l|$ $(> 0)$

and can therefore never converge to $l$, as (2.12) requires.


III. A PROOF OF THEOREM 2


7. Proof of Theorem 2, Case 1. We assume r < n, 0 < \ < 2, and that the 
relaxation process (1.9) furnishes an infinite sequence {p,}. We are to show that 










$p_\nu$ converges to a point $l$ of $A$. If the sequence $\{p_\nu\}$ converges to a point $l$ then
the argument of § 5 shows that $l \in A$ and we are through. Let us now assume
that

(3.1) the sequence $\{p_\nu\}$ does not converge

and show that we shall reach a contradiction. By Lemma 1, Case 2, and (3.1),
we conclude that all the limit points of $\{p_\nu\}$ are on a spherical surface $S_{n-r-1}$
having $L_r$ as an axis.


(a) Our assumptions imply that

(3.2) $\inf_\nu |p_{\nu+1} - p_\nu| = c > 0.$

Indeed, consider the function $d(x)$, defined by (2.10), for the points $x$ on the
surface $S_{n-r-1}$. Since $A \cap S_{n-r-1} = \emptyset$, we conclude that

$d(x) > 0, \quad \text{if } x \in S_{n-r-1}.$

Since $d(x)$ is continuous and $S_{n-r-1}$ is compact, we conclude that

$\gamma = \min d(x) > 0, \quad x \in S_{n-r-1}.$

Let us select $\gamma_1$ fixed such that $0 < \gamma_1 < \gamma$. Let $\delta$ be positive and let $N_\delta$ denote
the set of points $x$ defined by

$\operatorname{dist}(x, S_{n-r-1}) < \delta.$

Again by the continuity of $d(x)$ we can select $\delta$ so small that

(3.3) $d(x) > \gamma_1, \quad x \in N_\delta.$

By Lemma 1, Case 2, we have $p_\nu \in N_\delta$, provided $\nu > \nu_0$. Now (3.3) implies
that

$|p_{\nu+1} - p_\nu| = \lambda\, d(p_\nu) > \lambda \gamma_1,$

provided $\nu > \nu_0$. This proves (3.2) with $c \ge \lambda\gamma_1 > 0$.

We may now easily show that the assumption (3.1) leads to a contradiction.
Indeed, let the point $\alpha$ on $S_{n-r-1}$ be a limit point of $\{p_\nu\}$. For an appropriate
subsequence $\{\nu'\}$ of the sequence of all integers $\{\nu\}$ we have

$p_{\nu'} \to \alpha \in S_{n-r-1}.$

For a subsequence $\{\nu''\}$ of $\{\nu'\}$ we may also assume that

$p_{\nu''} \to \alpha, \quad \text{and} \quad p_{\nu''+1} \to \beta \in S_{n-r-1}.$

By (3.2) we conclude that $\alpha \ne \beta$; in fact

$|\alpha - \beta| \ge c > 0.$

Select on the line through $\alpha$ and $\beta$ a point $\eta$ such that $\beta - \alpha = \lambda(\eta - \alpha)$, and
notice, because of $0 < \lambda < 2$, that $\eta$ is nearer to $\beta$ than to $\alpha$:

(3.4) $|\eta - \alpha| > |\eta - \beta|.$

For the subsequence $\{\nu''\}$, the half-spaces $H_{i_\nu}$ used in obtaining $p_{\nu+1}$ from
$p_\nu$ must converge to the half-space

(3.5) $H : |x - \alpha|^2 - |x - \beta|^2 \ge |\eta - \alpha|^2 - |\eta - \beta|^2;$

in fact $H_{i_\nu}$ must already be identical with $H$, for sufficiently large $\nu''$, because
of the finiteness of the number of half-spaces $H_i$. This, however, leads to a
contradiction, for on the one hand $A \subset H_i$ implies that $A \subset H$. Hence $x \in A$
implies $x \in H$ and therefore by (3.5) and (3.4)

$|x - \alpha|^2 - |x - \beta|^2 \ge |\eta - \alpha|^2 - |\eta - \beta|^2 > 0,$

or

(3.6) $|x - \alpha| > |x - \beta|.$

On the other hand, $A$ being on the axis of $S_{n-r-1}$, we must have in $A$ the
equality $|x - \alpha| = |x - \beta|$ in contradiction to (3.6). Thus our assumption
(3.1) is untenable and the proof is completed.


8. Proof of Theorem 2, Case 2. We assume $r < n$, $\lambda = 2$, and that the
reflexion process produces an infinite sequence $\{p_\nu\}$. This sequence cannot
possibly converge, for its limit $l$ would belong to $A$ (§ 5) and the sequence would
then have to terminate (§ 6). By Lemma 1, Case 2, the only alternative is that the
sequence $\{p_\nu\}$ converges to a spherical surface $S_{n-r-1}$ of axis $L_r$. Then (3.2) or

(3.7) $\inf |p_{\nu+1} - p_\nu| > 0$

again holds. Out of $A$ select $r + 1$ fixed points $a_0, a_1, \ldots, a_r$ spanning $L_r$ and
let $\pi_{i_\nu}$ be the reflecting hyperplane used in obtaining $p_{\nu+1}$ as the point
symmetric to $p_\nu$. Since all limit points of $p_\nu$ are on $S_{n-r-1}$, it is clear that

$\lim |a_k - p_\nu| = \lim |a_k - p_{\nu+1}| = R(a_k) \quad (k = 0, \ldots, r).$

This, together with (3.7), shows that

$\lim_{\nu \to \infty} \operatorname{dist}(a_k, \pi_{i_\nu}) = 0 \quad (k = 0, \ldots, r).$

By the finiteness of our supply of reflecting hyperplanes we conclude from the
last relations that

$a_k \in \pi_{i_\nu}, \quad \nu > \nu_0;\ k = 0, \ldots, r;$

or what amounts to the same thing:

$L_r \subset \pi_{i_\nu}, \quad \nu > \nu_0.$

In other words: there is a number $\nu_0$ such that all reflexions for $\nu > \nu_0$ are
performed with respect to hyperplanes $\pi_{i_\nu}$ which contain the axis $L_r$ of $S_{n-r-1}$.
This, however, requires that

$p_\nu \in S_{n-r-1}, \quad \nu > \nu_0.$

Indeed, if $p_{\nu_0} \notin S_{n-r-1}$, then all $p_\nu$ ($\nu > \nu_0$) would lie on a surface $S'_{n-r-1}$ of
axis $L_r$, passing through $p_{\nu_0}$, and could then not converge to $S_{n-r-1}$, as we
assumed.


IV. THE REFLEXION PROCESS WITH RESPECT TO A CONVEX DOMAIN 


9. Statement of the problem. We have so far discussed the behavior of the
reflexion process with respect to a finite family $\{H\}$ of half-spaces in $E_n$. Do
the results obtained extend to infinite classes of half-spaces? We deal here only
with the following special case of this problem: Let $A$ be a given closed and
bounded convex set in $E_n$. A closed half-space $H$ belongs to the family $F$ if and
only if the boundary of $H$ is a hyperplane of support of $A$ and $A \subset H$. Let
$p \notin A$ and let $q$ be the point of $A$ which is nearest to $p$. Let $\pi_0$ be the hyperplane
through $q$ which is normal to the segment joining $p$ and $q$ and let $H_0$ be the
closed half-space, bounded by $\pi_0$, which does not contain $p$. Evidently $H_0 \in F$;
also

$\operatorname{dist}(p, H_0) = \max_{H \in F} \operatorname{dist}(p, H).$

Indeed, if there were an $H \in F$ such that $\operatorname{dist}(p, H) > \operatorname{dist}(p, H_0) = |p - q|$,
then $q \notin H$, in contradiction to the fact that $q \in A = \bigcap H$. This shows that the
reflexion process with respect to the family $F = \{H\}$ amounts to the construction
of the point

(3.1) $p_1 = p + 2(q - p) = F(p) \quad (p \notin A).$

Let us call $p_1 = F(p)$ the image of $p$ with respect to $A$.
If $p_1 \notin A$ we may form $p_2 = F(p_1)$ and continue in like manner obtaining a
sequence of points $p = p_0, p_1, p_2, \ldots$ connected by the relation

(3.2) $p_{\nu+1} = F(p_\nu) \quad (\nu = 0, 1, \ldots).$

We have again the old alternative: (1) The process terminates after $N$ steps
with $p_N \in A$; (2) The process continues indefinitely, producing an infinite
sequence $\{p_\nu\}$.


10. The main result. The behavior of the reflexion process with respect to
$A$ is described by the following

THEOREM 3. Let $A$ be a closed convex and bounded set of $E_n$ of dimension $r$
and let $L_r$ be the $r$-flat containing $A$. Suppose $p_0 \notin A$ and let $\{p_\nu\}$ be the sequence
obtained by the reflexion process (3.1), (3.2).

Case 1. If $r = n$, then the process always terminates.

Case 2. Let $r < n$. If $p_0 \in L_r$ then the process terminates. If $p_0 \notin L_r$ then the
process produces an infinite sequence $\{p_\nu\}$ with the following property. There is a
number $\nu_0$ such that for all $\nu \ge \nu_0$ the points $p_\nu$ oscillate between two points which
are symmetric with respect to $L_r$.
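
The two cases of the theorem can be observed on small examples. The sketch below is illustrative only: the function names, the choice of a unit disk (for $r = n = 2$) and of a segment on the horizontal axis (for $r = 1 < n = 2$), and the tolerance `tol` are all assumptions made here, not part of the paper.

```python
import math

def reflect_until_inside(p, project, max_steps=50, tol=1e-12):
    """Iterate p -> F(p) = p + 2(q - p), q = nearest point of A, until p is in A."""
    path = [tuple(p)]
    for _ in range(max_steps):
        q = project(path[-1])
        if math.dist(q, path[-1]) <= tol:   # p already belongs to A
            break
        path.append(tuple(2 * qi - pi for qi, pi in zip(q, path[-1])))
    return path

def project_disk(p):
    """Nearest point of the closed unit disk (r = n = 2)."""
    r = math.hypot(*p)
    return tuple(p) if r <= 1 else tuple(pi / r for pi in p)

def project_segment(p):
    """Nearest point of the segment [-1, 1] x {0} (r = 1 < n = 2)."""
    return (min(1.0, max(-1.0, p[0])), 0.0)

# Case 1: the process terminates.
disk_path = reflect_until_inside((3.0, 0.0), project_disk)
# Case 2 with p_0 off the line L_r: the points end up oscillating.
segment_path = reflect_until_inside((4.0, 0.5), project_segment, max_steps=6)
```

In the second run the projections of the points on the horizontal axis reach the segment after two steps, and from then on the points alternate between $(0, 0.5)$ and $(0, -0.5)$.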
Proof. CASE 1. Let $r = n$ and let us assume, to obtain a contradiction, that
the sequence $\{p_\nu\}$ is infinite. We know that $\{p_\nu\}$ is a Fejér-monotone sequence
and hence that it converges to a point $a$ by Lemma 1, Case 1. Clearly $a \in A$,
and hence $a$ is on the boundary of $A$, by the argument of § 5. Let $P$ be
the projection cone of $A$ at the point $a$, i.e. the intersection of all $H$ whose
boundary hyperplanes pass through $a$. The cone $P$ is convex and of dimension
$n$, since $P \supset A$. There is therefore a half-space $H \in F$, whose boundary $\pi$
supports $P$ (and also $A$) at the point $a$ and whose interior normal $ai$, at $a$, is
wholly interior to $P$, except for the point $a$. Let us think of $\pi$ as horizontal and
its normal $ai$ as pointing vertically downward. The point $i$ being interior to $P$,
also a certain small spherical neighborhood $S$ of $i$ is in $P$. Let us call $C$ the slim
circular cone, of vertex $a$ and axis $ai$, which is circumscribed to $S$. This convex
cone $C$ is wholly in $P$.

Let us denote by $Q$ the convex cone of vertex $a$ which is generated by the
interior normals of all $H$ supporting $P$ (or $A$) at the point $a$. These $H$ support
also $C$ at $a$. It follows that the closed convex cone $Q$ (called the polar cone of $A$
at $a$) has only the point $a$ in common with $\pi$, $Q$ being below $\pi$. $Q$ was defined as
the locus of all interior normals of $A$ at $a$. Let us denote by $N$ a small
neighborhood of $a$ on the boundary of $A$. Let $Q'$ be a given closed and convex cone
satisfying the following conditions: (i) Every ray of $Q$ is interior to $Q'$, (ii) $Q'$
has only the point $a$ in common with $\pi$. Clearly, a neighborhood $N$ of $a$ exists
such that the interior normals to $A$ at the points of $N$, if transferred parallel to
themselves so as to start at $a$, will all lie in $Q'$. In fact otherwise we could find
normals at points converging to $a$ whose limit (which is obviously a normal
at $a$) would be outside or on the boundary of $Q'$.

We now return to the sequence of points $\{p_\nu\}$ which converges to $a$. Let $q_\nu$
be the midpoint of the segment $p_\nu p_{\nu+1}$. It is clear by our construction that
$q_\nu \in$ boundary of $A$ and that the vectors $q_\nu p_{\nu+1}$ are interior normals to $A$.
Also $p_\nu \to a$ implies that $q_\nu \to a$, hence $q_\nu \in N$, provided $\nu \ge \nu_0$. This leads to
a contradiction. Indeed, consider the sequence of points

$p_{\nu_0}, p_{\nu_0+1}, \ldots$

We know from what was said above that the vectors $q_\nu p_{\nu+1}$, if transferred to $a$,
lie in $Q'$. That means that the vectors

$p_\nu p_{\nu+1} \quad (\nu \ge \nu_0)$

have a positive component in the direction of the downward vertical vector
$ai$. Since $p_\nu \to a$, we conclude that all points $p_\nu$ ($\nu \ge \nu_0$) are above the horizontal
plane $\pi$. But this implies that also the $q_\nu$ are above $\pi$. This, however, is absurd
since $q_\nu \in A$ and $A$ is below $\pi$.


CASE 2. If $r < n$ and $p_0 \in L_r$ then again the process must terminate by the
previous case because we have no occasion to leave $L_r$ in the course of our
process.

We now assume that $r < n$ while $p_0 \notin L_r$. Let $L_{r+1}$ be the $(r + 1)$-flat
containing $L_r$ and $p_0$. Note that we never leave $L_{r+1}$, which amounts to assuming
at the start that $r = n - 1$. Let, therefore, $A$ be $(n - 1)$-dimensional, $A \subset L_{n-1}$
and $p_0 \notin L_{n-1}$. Let $p'_0$ and $p'_1$ be the projections of $p_0$ and $p_1$, respectively, on
$L_{n-1}$. It should be clear that a point $q_0$ of $A$ is nearest to $p_0$ if and only if it is
nearest to $p'_0$. It follows that $p_1 = F(p_0)$ implies that $p'_1 = F(p'_0)$. Consider
the sequence of reflexions $\{p_\nu\}$. It is clear that the distance from $p_\nu$ to $L_{n-1}$ has
the same positive value $\operatorname{dist}(p_0, L_{n-1})$, the points $p_\nu$ passing from one side of
the plane to the other alternately. However, the sequence $\{p'_\nu\}$ of their
projections on $L_{n-1}$ has the property $p'_{\nu+1} = F(p'_\nu)$. By the previous case this
sequence "terminates" with a first $p'_N \in A$. From that moment onwards the
sequence $p_\nu$ must oscillate between two points on the normal to $L_{n-1}$ at the
point $p'_N = p'_{N+1} = \ldots$.
ON SOME RECENT DEVELOPMENTS
IN THE THEORY OF SERIES 


M. S. MACPHAIL 


In a number of recent papers, especially by Wilansky (4; 6), Zeller (8), and
Peyerimhoff (3), the sequence-to-sequence transformation

$A: \quad y_n = \sum_{k=0}^{\infty} a_{nk} x_k \quad (n = 0, 1, \ldots)$

has been studied under certain conditions, designated by FAK, PMI, etc.
(see §3). The purpose of this note is to point out some relations among these
conditions, and to show that some theorems previously obtained hold under
weaker assumptions.

We begin with a remark concerning Mazur’s well-known consistency 
theorem. This has been proved by several authors, (Mazur (2, Theorem 7), 
Banach (1, p. 95, Theorem 12), Wilansky (4, Theorem 3.3.1), Peyerimhoff 
(3, Theorem 4.2)) under various conditions and by various methods of proof. 
We give here a simple direct proof of the theorem, using only the assumptions 
that A is co-regular and reversible, as defined below. 


Notation. We shall assume throughout the paper that $A$ satisfies the
"row-norm" condition: there is a constant $M$ such that $\sum_k |a_{nk}| < M$ $(n = 0, 1, \ldots)$.
We denote the column limits of $A$ by

$a_k = \lim_n a_{nk} \quad (k = 0, 1, \ldots),$

the row sums by

$a_n = \sum_k a_{nk} \quad (n = 0, 1, \ldots),$

and we put

$a = \lim_n a_n, \quad \rho_A = a - \sum_k a_k.$

If $A$ limits every convergent sequence, or equivalently, if $a_k$, $a$ exist, $A$ is called
conservative; if $\rho_A \ne 0$, $A$ is co-regular, while if $a = 1$, $a_k = 0$ $(k = 0, 1, \ldots)$ so
that $\lim_n y_n = \lim_n x_n$ for each convergent sequence $\{x_n\}$, $A$ is regular. Similarly
we denote the column limits of a matrix $B$ by $b_k$, and so on. We define

$\delta^k = \{\delta^k_n\} = \{0, 0, \ldots, 0, 1, 0, \ldots\},$

and $\alpha^* = \{\alpha, \alpha, \ldots\}$ for any real number $\alpha$. We denote the set of all sequences
$\delta^k$ by $\Delta$, and the same with $1^*$ adjoined, by $\Phi$. The set of all sequences $\{x_n\}$ for
which $\{y_n\}$ converges is denoted by $(A)$, and the set of all $\{x_n\}$ such that
$y_n \to 0$ by $(A)_0$. A summability method is reversible if to each convergent
sequence $\{y_n\}$ there corresponds a unique sequence $\{x_n\}$. It is well known that if
$A$ is reversible, $(A)$ and $(A)_0$ are Banach spaces under the norm

$\|x\| = \sup_n \Big| \sum_k a_{nk} x_k \Big|,$

and that the general continuous linear functional (c.l.f.) on $(A)$ is given by

(1) $f(x) = tA(x) + \sum_n t_n A_n(x),$

where $A_n(x) = \sum_k a_{nk} x_k$, $A(x) = \lim_n A_n(x) = A\text{-}\lim x_n$, and $\sum |t_n| < \infty$. If
two methods $A$, $B$ agree on $(A) \cap (B)$, they are called consistent. Evidently
if $A$ is conservative and $(B) \supset (A)$, then $B$ is also conservative.
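
As an illustration of these definitions, the Cesàro matrix $C_1$ (mentioned later in the paper) has row sums equal to 1 and column limits equal to 0, so it is regular. The sketch below checks this on a finite truncation; the truncation size `N`, the helper names and the test sequence are assumptions made here for illustration, and a finite computation can only suggest, not prove, the limiting statements.

```python
from fractions import Fraction

def cesaro(n, k):
    """Entry a_{nk} of the Cesaro matrix C_1: 1/(n+1) for k <= n, else 0."""
    return Fraction(1, n + 1) if k <= n else Fraction(0)

def transform(x):
    """y_n = sum_k a_{nk} x_k for the rows available from the finite list x."""
    return [sum(cesaro(n, k) * x[k] for k in range(n + 1)) for n in range(len(x))]

N = 200
row_sums = [sum(cesaro(n, k) for k in range(n + 1)) for n in range(N)]
column_tail = [cesaro(N - 1, k) for k in range(5)]   # a_{nk} at n = N - 1

# x_k = 1 + 1/(k+1) converges to 1; its C_1 transform should approach 1 too.
x = [1 + Fraction(1, k + 1) for k in range(N)]
y = transform(x)
```

Every row sum is exactly 1 (so $a = 1$), and each fixed column entry at row $N - 1$ is $1/N$, consistent with $a_k = 0$.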


THEOREM 1. Let $A$ be reversible and co-regular. Then in order that $A$ be
consistent with every method $B$ such that

$(B) \supset (A)$ and $B(x) = A(x)$ on $\Phi$,

it is necessary and sufficient that for any sequence $\{t_n\}$,

M: $\sum_n |t_n| < \infty, \quad \sum_n t_n a_{nk} = 0 \ (k = 0, 1, \ldots)$

imply $t_n = 0$ $(n = 0, 1, \ldots)$.

Proof. Every method $B$ with $(B) \supset (A)$ represents a c.l.f. on $(A)$, since
each $x_k$ is a c.l.f. (1, p. 47) and therefore so is $\lim_n \sum_k b_{nk} x_k$ (1, p. 23, Theorem
4). Conversely every c.l.f. on $(A)$ can be represented by a matrix method $B$
with $(B) \supset (A)$, for example with $f(x)$ as in (1) we may let
$b_{nk} = t_0 a_{0k} + t_1 a_{1k} + \ldots + t_{n-1} a_{n-1,k} + t a_{nk}$. Hence, for $A$ to have the property
stated it is necessary and sufficient that every c.l.f. which vanishes on $\Phi$ should
vanish throughout $(A)$, that is,

M_1: $\sum_n |t_n| < \infty, \quad t a_k + \sum_n t_n a_{nk} = 0 \ (k = 0, 1, \ldots), \quad t a + \sum_n t_n a_n = 0$

imply $t_n = 0$ $(n = 0, 1, \ldots)$, $t = 0$.

But M_1 is equivalent to M. For we have by absolute convergence,

$\sum_k \sum_n t_n a_{nk} = \sum_n \sum_k t_n a_{nk} = \sum_n t_n a_n.$

Hence the left-hand side of M_1 is equivalent to

$\sum_n |t_n| < \infty,$
$t a_k + \sum_n t_n a_{nk} = 0 \quad (k = 0, 1, \ldots),$
$t \Big( a - \sum_k a_k \Big) = 0.$

Since $a - \sum_k a_k \ne 0$, the assertion now follows. This proves the theorem.
Several theorems previously stated for normal matrices (that is, triangular
with non-zero diagonal terms) can easily be extended to reversible matrices.
We give one example (compare Peyerimhoff, (3, Theorem 4.4)). It is known
(1, p. 50) that if $y_n = \sum_k a_{nk} x_k$ with $A$ reversible, there exist constants $c_k$, $c_{kp}$
with $\sum_p |c_{kp}| < \infty$ for each $k$, such that for each convergent sequence $\{y_p\}$ we have

(2) $x_k = c_k \lim_p y_p + \sum_p c_{kp} y_p.$

THEOREM 2. Let $A$ be reversible and let $x_k$ be represented as in (2). If the matrix
$(c_{kp})$ has bounded columns, then M holds.

Proof. Assume $\sum |t_n| < \infty$, $\sum_n t_n a_{nk} = 0$ $(k = 0, 1, \ldots)$. Then $\sum_k c_{kp} \sum_n t_n a_{nk} = 0$.
By absolute convergence we have $\sum_n t_n \sum_k a_{nk} c_{kp} = 0$. Now by a lemma of
Wilansky (5, Lemma 3), we have $\sum_k a_{nk} c_{kp} = \delta^p_n$. Hence $t_p = 0$ $(p = 0, 1, \ldots)$
and so M holds.


We now state the conditions referred to in the introduction. If $\sum a_k x_k$
converges for each $x \in (A)$, the matrix $A$ is said to have maximal inset (6, p. 648).
If every matrix $B$ with $(B) = (A)$ has maximal inset, $A$ has the property of
propagation of maximal inset (briefly, $A$ has PMI).

The conditions of Zeller which will next be stated were defined for elements
of any FK-space $E$ (7; 8), but in the present paper $E$ will be one or other of the
Banach spaces $(A)$, $(A)_0$. For a given $x = \{x_k\} \in E$, the $r$th segment
(Abschnitt) is the sequence

$x^{(r)} = \{x_0, x_1, \ldots, x_r, 0, 0, \ldots\}.$

The property AK (Abschnittskonvergenz) is that for a given $x$ we have
$x^{(r)} \to x$ or equivalently $\sum x_k \delta^k = x$. If this holds for each $x \in E$, then $E$ is
said to have AK, which is equivalent to $\Delta$ being a basis for $E$ (1, p. 110). It is
known (8, Beispiel 4.2) that if $C_1$ is the Cesàro method of order 1, $(C_1)_0$ has
AK. Similarly SAK (schwache Abschnittskonvergenz) means $f(x^{(r)}) \to f(x)$
for each $f$ defined on $E$, or equivalently $\sum x_k f(\delta^k) = f(x)$, FAK (funktionale
Abschnittskonvergenz) that $\sum x_k f(\delta^k)$ converges for each $f$, not necessarily to
$f(x)$, and AD (Abschnittsdichte) that $x$ is a limit point of the set of all
segments. For $E$ to have AD it is necessary and sufficient that $\Delta$ be fundamental
in $E$ (1, p. 58).

As for the relations among these conditions, we have obviously the logical
implications AK → SAK → FAK, and by a standard theorem on weak
convergence (1, p. 134), SAK → AD. It has been proved by Wilansky (6, Lemma
16) that if $A$ is reversible, co-regular and has PMI, then $(A)$ has FAK. We
shall show that by modifying the proof we may weaken the assumption that $A$
is co-regular and arrive at the following result.


THEOREM 3. Let $A$ be a reversible, conservative matrix. Then $A$ has PMI if and
only if $(A)$ has FAK.

Proof. (a) Since every matrix $B$ with $(B) = (A)$ represents a c.l.f. $B(x)$ on
$(A)$, with $B(\delta^k) = b_k$, it is obvious that FAK → PMI.

(b) Let $A$ have PMI, and let $f(x) = tA(x) + \sum_n t_n A_n(x)$ be a c.l.f. on $(A)$.
If $\theta$, $\theta^n$ denote the solutions of $y = Ax$ when $y$ equals $1^*$, $\delta^n$ respectively, we
have

$t_n = f(\theta^n), \quad t = t(f) = f(\theta) - \sum_n f(\theta^n).$

It is well known that if $t \ne 0$, the corresponding matrix $B$ (see the proof of
Theorem 1) has $(B) = (A)$. (Indeed $B = TA$, where $T$ is the matrix whose $n$th
row is $(t_0, t_1, \ldots, t_{n-1}, t, 0, 0, \ldots)$ and it can be shown (6, Lemma 1) that $T$
sums only convergent sequences.) We then have at once from PMI that
$\sum x_k f(\delta^k) = \sum b_k x_k$ converges. If however $t = 0$, we define $g(x) = A(x) + f(x)$
on $(A)$. Then

$g(\theta) = 1 + f(\theta), \quad g(\theta^n) = f(\theta^n),$

so $t(g) = g(\theta) - \sum_n g(\theta^n) \ne 0$, and $\sum x_k g(\delta^k)$ converges. But $g(\delta^k) = a_k + f(\delta^k)$
and so

$\sum_k x_k f(\delta^k) = \sum_k x_k g(\delta^k) - \sum_k a_k x_k$

converges. Hence $(A)$ has FAK.


THEOREM 4. Let $A$ be reversible and co-regular. Then $A$ has PMI if and only if
$\Phi$ is a basis for $(A)$.

This is proved by Wilansky (6, p. 650) under the assumption that $A$ is
normal. But an examination of the proof shows that this is introduced only
because at a certain point it is shown that $A$ satisfies condition M, and one
wishes to conclude that $\Phi$ is fundamental in $(A)$. Theorem 1 shows that
reversibility is sufficient for this.


THEOREM 5. Let $A$ be reversible and regular. Then $(A)_0$ has AK if and only if
$(A)_0$ [or equivalently $(A)$] has FAK.

Proof. Let $(A)_0$ have AK; then by a general implication already mentioned,
$(A)_0$ has FAK, whence by an easy deduction (8, Beispiel 4.4), $(A)$ has FAK.
Conversely, let $(A)$ have FAK. Then by Theorem 3, $A$ has PMI, and by
Theorem 4, $\Phi$ is a basis for $(A)$. But $\Delta \subset (A)_0$ and $1^* \notin (A)_0$, hence $\Delta$ is a
basis for $(A)_0$, and $(A)_0$ has AK.

Remark. It is shown by Zeller (8, Theorem 3.4) that for any FK-space $E$,
FAK and AD together imply AK. Theorem 5 shows that for certain spaces
AD can be dropped.

The relation between M and PMI for a regular reversible method can be
summarized: M means that $\Delta$ is fundamental in $(A)_0$, or $\Phi$ in $(A)$; PMI that
$\Delta$ is a basis for $(A)_0$, or $\Phi$ for $(A)$.


THEOREM 6. Let $A$ be reversible, regular, and have PMI. Then for any matrix $B$
with $(B) \supset (A)$ we have the representation

(3) $B(x) = \rho_B A(x) + \sum_k b_k x_k,$

valid for each $x \in (A)$.

Proof. By Theorem 5, $(A)_0$ has AK and therefore SAK. Now $B(x)$ is a c.l.f.
on $(A)_0$ and so $B(x) = \sum x_k B(\delta^k) = \sum b_k x_k$ for $x \in (A)_0$. For any $x \in (A)$ with
$A(x) = \sigma$, we write $x = \sigma^* + (x - \sigma^*)$, with $x - \sigma^* \in (A)_0$. Then

$B(x) = B(\sigma^*) + B(x - \sigma^*)$
$\quad = b\sigma + \sum_k b_k (x_k - \sigma)$
$\quad = \rho_B \sigma + \sum_k b_k x_k,$

which proves the theorem.

If $B$ is co-regular a simpler argument, based on the matrix $(1/\rho_B)(b_{nk} - b_k)$,
suffices. The condition PMI is obviously necessary, as without it (3) would not
be defined even for all $B$ with $(B) = (A)$.
ON BESSEL POLYNOMIALS
R. P. AGARWAL


1. Introduction. Recently a number of papers have been written on Bessel
polynomials, which arise as the solutions of the classical wave equation in
spherical coordinates. Krall and Frink (5) studied in some detail the properties
of these polynomials $y_n(x, a, b)$, defined as

(1) $y_n(x, a, b) = {}_2F_0(-n,\ a + n - 1;\ -x/b).$
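
Since the first parameter $-n$ makes the hypergeometric series terminate, (1) is a polynomial of degree $n$ whose coefficients can be generated term by term. The following minimal sketch (the function name `bessel_coeffs` is introduced here for illustration only) uses exact rational arithmetic and recovers the familiar cases $y_1(x) = 1 + x$, $y_2(x) = 1 + 3x + 3x^2$ for $a = b = 2$.

```python
from fractions import Fraction

def bessel_coeffs(n, a, b):
    """Coefficients c_0..c_n of y_n(x, a, b) = sum_k c_k x^k from the 2F0 series (1)."""
    coeffs, term = [], Fraction(1)
    for k in range(n + 1):
        coeffs.append(term)
        # ratio of consecutive terms: (-n + k)(a + n - 1 + k) / (k + 1) * (-1/b)
        term = term * (-n + k) * (a + n - 1 + k) / (-b * (k + 1))
    return coeffs

y1 = bessel_coeffs(1, 2, 2)
y2 = bessel_coeffs(2, 2, 2)
y3 = bessel_coeffs(3, 2, 2)
```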


In particular, they gave the differential equation, $n$th differential formula,
orthogonal property and recurrence relations. The special case $y_n(x)$ of these
polynomials, obtained by taking $a = b = 2$, was studied in greater detail than
the generalised polynomials. Besides the above properties, a generating
function was also derived for $y_n(x)$. It was also shown that $y_n(x)$ are closely
connected with Bessel functions of half-integral order.

Later, Burchnall (2) identified these polynomials with certain other
polynomials studied by him many years earlier (3). With the help of certain
differential operators he filled in certain gaps in Krall and Frink's work and also
gave simple proofs of some of their formulae. He derived a generating function for
$y_n(x, a, b)$ and gave some properties regarding the zeros of the polynomials
$y_n(x)$, which were not given by Krall and Frink.

Very recently Rainville (6) and Brafman (1) have given some generating
functions for these polynomials. The generating functions given by Rainville were
very general and one of them includes Burchnall's result as a particular case.

But it seems rather surprising that none of the above authors seems to have
noticed that these polynomials $y_n(x, a, b)$ are merely a special limiting case of
the classical Jacobi polynomials $P_n^{(\alpha,\beta)}(x)$. In this paper I use this definition
and derive most of the results given by Krall and Frink and Burchnall as
simple limiting cases of known results for these polynomials.

Since Jacobi polynomials have been extensively studied, one can find many
more interesting properties for $y_n(x, a, b)$ simply as limiting cases.

In §9 I give certain simple expansions involving these polynomials to
indicate the possibility of getting more results of this type.

I conclude the paper by giving a very simple relationship between $y_n(x, a, b)$
and Whittaker's function $W_{k,m}(x)$. I am grateful to Professor Bailey for
pointing out this interesting relationship.


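2. Definition. We know that (7, 4.21.2)

$P_n^{(\alpha,\beta)}(x) = \binom{n+\alpha}{n}\, {}_2F_1\!\left(-n,\ n+\alpha+\beta+1;\ \alpha+1;\ \frac{1-x}{2}\right).$

Thus from the definition (1) of $y_n(x, a, b)$ it follows easily that

(2) $y_n(x, a, b) = \lim_{\epsilon \to \infty} \frac{n!\,\Gamma(\epsilon)}{\Gamma(\epsilon + n)}\, P_n^{(\epsilon-1,\ a-\epsilon-1)}\!\left(1 + \frac{2\epsilon x}{b}\right).$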


3. The differential equation. The differential equation satisfied by $P_n^{(\alpha,\beta)}(x)$
is known to be (7, 4.2.1)

$(1 - x^2)\frac{d^2y}{dx^2} + [\beta - \alpha - (\alpha + \beta + 2)x]\frac{dy}{dx} + n(n + \alpha + \beta + 1)\,y = 0.$

Changing the variables by putting

$1 + \frac{2\epsilon x}{b}$ for $x$, $\alpha = \epsilon - 1$ and $\beta = a - \epsilon - 1$

and using (2) we get on taking the limit that the differential equation satisfied
by $y_n(x, a, b)$ is

(3) $x^2\frac{d^2y}{dx^2} + (ax + b)\frac{dy}{dx} = n(n + a - 1)\,y.$

This is the equation given by Krall and Frink (5, equation 2).
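4. The nth differential formula. It is known that (7, 4.3.1)

$(1 - x)^{\alpha}(1 + x)^{\beta} P_n^{(\alpha,\beta)}(x) = \frac{(-1)^n}{2^n n!}\left(\frac{d}{dx}\right)^n \left\{(1 - x)^{n+\alpha}(1 + x)^{n+\beta}\right\}.$

Using the definition (2) we get

$y_n(x, a, b) = \lim_{\epsilon \to \infty} \frac{(-1)^n\,\Gamma(\epsilon)}{2^n\,\Gamma(\epsilon + n)} \left(\frac{b}{2\epsilon}\right)^n \left(-\frac{2\epsilon x}{b}\right)^{1-\epsilon} \left(2 + \frac{2\epsilon x}{b}\right)^{1+\epsilon-a} \frac{d^n}{dx^n}\left\{\left(-\frac{2\epsilon x}{b}\right)^{n+\epsilon-1}\left(2 + \frac{2\epsilon x}{b}\right)^{n+a-\epsilon-1}\right\}.$

This on simplification gives

(4) $y_n(x, a, b) = b^{-n} x^{2-a} e^{b/x} \frac{d^n}{dx^n}\left(x^{2n+a-2} e^{-b/x}\right),$

which gives Krall and Frink's result (5, equation 47).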

5. Recurrence formulae. We can easily deduce from the known results
(7; 4.5.1, 4.5.7 (i) and (ii) and 4.5.4 (ii)) the following four recurrence
formulae for $y_n(x, a, b)$:

(5) $(n + a - 1)(2n + a - 2)\,y_{n+1} = [(2n + a)(2n + a - 2)\,x/b + a - 2](2n + a - 1)\,y_n + n(2n + a)\,y_{n-1},$

(6) $x^2(2n + a - 2)\,y_n' = [n(2n + a - 2)\,x - nb]\,y_n + nb\,y_{n-1},$

(7) $(2n + a)\,x^2 y_n' = (1 - n - a)[(2n + a)x + b]\,y_n + b(n + a - 1)\,y_{n+1},$

(8) $(2n + a)\,x\,y_n(x, a + 1, b) = b\,[y_{n+1} - y_n].$

The first two recurrence formulae were given by Krall and Frink (5, 51) but
the last two are new. It may be remarked that the proof of (6) above as given
by Krall and Frink was especially long.
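
The differential equation and the recurrence formulae of this section lend themselves to a mechanical check with exact polynomial arithmetic. The sketch below is not part of the paper: the helper names are hypothetical, polynomials are stored as coefficient lists (lowest degree first), and the formulae are read as $x^2 y'' + (ax + b)y' = n(n + a - 1)y$ for (3), with (5) to (8) in their usual forms with $y_n'$ on the left of (6) and (7). A residual list of zeros means the identity holds for that $n$, $a$, $b$.

```python
from fractions import Fraction

def y(n, a, b):
    """Coefficient list of y_n(x, a, b) from the terminating 2F0 series (1)."""
    coeffs, term = [], Fraction(1)
    for k in range(n + 1):
        coeffs.append(term)
        term = term * (-n + k) * (a + n - 1 + k) / (-b * (k + 1))
    return coeffs

def add(p, q):
    size = max(len(p), len(q))
    return [(p[i] if i < len(p) else 0) + (q[i] if i < len(q) else 0)
            for i in range(size)]

def scale(c, p):
    return [c * u for u in p]

def sub(p, q):
    return add(p, scale(-1, q))

def mul(p, q):
    out = [Fraction(0)] * (len(p) + len(q) - 1)
    for i, u in enumerate(p):
        for j, v in enumerate(q):
            out[i + j] += u * v
    return out

def deriv(p):
    return [k * c for k, c in enumerate(p)][1:] or [Fraction(0)]

def lin(c0, c1):
    """The polynomial c0 + c1 * x."""
    return [Fraction(c0), Fraction(c1)]

def residuals(n, a, b):
    """Left side minus right side of (3), (5), (6), (7), (8), for n >= 1."""
    yn, yp, ym = y(n, a, b), y(n + 1, a, b), y(n - 1, a, b)
    dy, x2, s = deriv(yn), [Fraction(0), Fraction(0), Fraction(1)], 2 * n + a
    r3 = sub(add(mul(x2, deriv(dy)), mul(lin(b, a), dy)),
             scale(n * (n + a - 1), yn))
    r5 = sub(scale((n + a - 1) * (s - 2), yp),
             add(mul(scale(s - 1, lin(a - 2, Fraction(s * (s - 2)) / b)), yn),
                 scale(n * s, ym)))
    r6 = sub(scale(s - 2, mul(x2, dy)),
             add(mul(lin(-n * b, n * (s - 2)), yn), scale(n * b, ym)))
    r7 = sub(scale(s, mul(x2, dy)),
             add(mul(scale(1 - n - a, lin(b, s)), yn),
                 scale(b * (n + a - 1), yp)))
    r8 = sub(mul(lin(0, s), y(n, a + 1, b)), scale(b, sub(yp, yn)))
    return [r3, r5, r6, r7, r8]

checks = [all(c == 0 for r in residuals(n, a, b) for c in r)
          for n in range(1, 6) for (a, b) in [(2, 2), (3, 1), (5, 4)]]
```

Exact fractions are used instead of floating point so that a vanishing residual is a genuine identity for the tested parameters rather than a rounding coincidence.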


6. The generating function. The generating function for the Jacobi
polynomials is (7, 4.4.5)

$\sum_{n=0}^{\infty} P_n^{(\alpha,\beta)}(x)\, w^n = 2^{\alpha+\beta}(1 - 2xw + w^2)^{-1/2}\{1 - w + (1 - 2xw + w^2)^{1/2}\}^{-\alpha} \{1 + w + (1 - 2xw + w^2)^{1/2}\}^{-\beta},$

valid for sufficiently small values of $|w|$. Putting

$w = \tfrac{1}{2}tb/\epsilon, \quad \alpha = \epsilon - 1, \quad \beta = a - \epsilon - 1, \quad x = 1 + 2\epsilon x/b,$

and using the definition (2) we get, on taking the limit,

(9) $[\tfrac{1}{2} + \tfrac{1}{2}(1 - 2xt)^{1/2}]^{2-a}(1 - 2xt)^{-1/2} \exp[\{1 - (1 - 2xt)^{1/2}\}\, b/2x] = \sum_{n=0}^{\infty} (\tfrac{1}{2}b)^n y_n(x, a, b)\, t^n/n!.$

This gives the generating function for $y_n(x, a, b)$ given by Burchnall (2, §6).
From this by obvious substitutions we can get the pseudo-generating function
for $y_n(x, a, b)$ given by Burchnall [2, §6 (24)] viz,

$\frac{(1 - u)^{2-a} e^{bux}}{1 - 2u} = \sum_{n=0}^{\infty} \frac{[bu(1 - u)]^n x^n}{n!}\, y_n(x^{-1}, a, b).$

For $a = b = 2$, (9) gives the generating function for $y_n(x)$ given by Krall and
Frink (5, 25).


7. A contour integral representation. An immediate consequence of the
generating function (9) not noted by the above authors is the following contour
integral representation for the polynomials $y_n(x, a, b)$. If $a$ is an integer (to
make the integrand single-valued), we have

$\frac{(\tfrac{1}{2}b)^n}{n!}\, y_n(x, a, b) = \frac{1}{2\pi i}\int_C t^{-n-1}\{\tfrac{1}{2} + \tfrac{1}{2}(1 - 2xt)^{1/2}\}^{2-a}(1 - 2xt)^{-1/2} \exp[\{1 - (1 - 2xt)^{1/2}\}\, b/2x]\, dt,$

or

$y_n(x, a, b) = \frac{n!}{2\pi i}\left(\frac{x}{b}\right)^n \int_C u^{-n-1}(1 - u)^{1-a-n} e^{bu/x}\, du,$

where $C$ is a simple closed curve round the origin, small enough not to pass
through the point $u = 1$, and $a$ is an integer.
8. Orthogonal property and properties of zeros. Since the polynomials
$y_n(x, a, b)$ are orthogonal over a unit circle and the Jacobi polynomials over the
real interval $(-1, 1)$, the limiting process does not help and we may have to
resort to other methods (e.g. 7, 11.5). But these seem to be more complicated
than the classical method given by Krall and Frink and hence there seems to
be no interest in going into the details. Once knowing the differential equation
satisfied by $y_n(x, a, b)$ we can easily apply Krall and Frink's method.

Further, we can also deduce some of the properties of the zeros of these
polynomials given by Burchnall, e.g., it is known that if $x_\nu$ $(\nu = 1, 2, \ldots, n)$
are the zeros of $P_n^{(\alpha,\beta)}(x)$ (7, ex. 14, p. 370) then

$\sum_{\nu=1}^{n} x_\nu = \frac{n(\beta - \alpha)}{2n + \alpha + \beta}.$

Let the zeros of

$P_n^{(\epsilon-1,\ a-\epsilon-1)}(1 + 2\epsilon x/b)$

be denoted by $X_\nu$ $(\nu = 1, 2, \ldots, n)$; then it is obvious that

$\sum_{\nu=1}^{n} X_\nu = \frac{b}{2\epsilon}\sum_{\nu=1}^{n}(x_\nu - 1) = \frac{nb}{2\epsilon}\left[\frac{a - 2\epsilon}{2n + a - 2} - 1\right].$

Now let $\{\xi_\nu\}$ be the zeros of $y_n(x, a, b)$; then obviously

$\sum_{\nu=1}^{n} \xi_\nu = \lim_{\epsilon \to \infty} \sum_{\nu=1}^{n} X_\nu = -\frac{nb}{2n + a - 2}.$

Hence, we have

(10) $\sum_{\nu=1}^{n} \xi_\nu = -\frac{nb}{2n + a - 2},$

where $\xi_\nu$ $(\nu = 1, 2, \ldots, n)$ are the zeros of $y_n(x, a, b)$.

This generalizes the result of Burchnall, who deduced that the sum of zeros
of $y_n(x)$ is equal to $-1$.
It is probable that one might find other properties of the zeros of $y_n(x, a, b)$
as limiting properties of the zeros of $P_n^{(\alpha,\beta)}(x)$, but I have not gone into the
detail of that aspect.
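
As a cross-check that does not use the limiting process, the sum of the zeros can also be read off from the two leading coefficients of the series (1). This is a sketch added here for illustration; $(c)_k$ denotes the Pochhammer symbol.

\[
y_n(x,a,b) = \frac{(a+n-1)_n}{b^{n}}\,x^{n} + \frac{n\,(a+n-1)_{n-1}}{b^{\,n-1}}\,x^{n-1} + \cdots ,
\qquad
\sum_{\nu=1}^{n}\xi_\nu = -\frac{n\,b\,(a+n-1)_{n-1}}{(a+n-1)_n} = -\frac{nb}{2n+a-2}.
\]

For $a = b = 2$ this equals $-1$ for every $n$, in agreement with Burchnall's value, and for $n = 1$ it equals $-b/a$, the single zero of $y_1 = 1 + ax/b$.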


9. Some expansions involving $y_n(x, a, b)$. We can get some interesting
expansions for these polynomials from known expansions of Burchnall and
Chaundy (4). I mention some of them, omitting the very simple proofs.
From (4, 30) we get 





(11) (x, b +8 + 1,c) = Be + MO )e (x /e)*y, (x, b+2r-+1, c) 


r=( 


X Ya-r(x, b’ — nm + 27 + 1,0), 


which may be taken as a quasi-addition theorem for the parameters $b$ and $b'$.
Similarly, from (4, 38), we get
(12) yon(2x, b + 1, «i (—2n)s 0 + 2n), (:)" 


r=0 





x Von—27(X, 2b + 2n + 4r a l, c); 


this may be taken as a duplication-theorem for these polynomials. 

In the same way one can write down many other expansions from Burchnall 
and Chaundy, and other known expansions involving ordinary hypergeometric 
functions. 


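10. The relationship of $y_n(x, a, b)$ with Whittaker's function $W_{k,m}(x)$. It
is known that (8, §16.3)

$W_{k,m}(x) \sim e^{-x/2} x^{k}\left\{1 + \sum_{r=1}^{\infty} \frac{(-)^r (m - k + \tfrac{1}{2})_r\,(-m - k + \tfrac{1}{2})_r}{r!\, x^r}\right\}$

for large values of $|x|$ when $|\arg x| < \pi - \alpha$. Putting $m = -k + \tfrac{1}{2} + n$, where
$n$ is a positive integer, we get

$W_{k,\,-k+\frac{1}{2}+n}(x) = e^{-x/2} x^{k} \sum_{r=0}^{n} \frac{(-)^r (-n)_r\,(1 - 2k + n)_r}{r!\, x^r}.$

Put $k = 1 - \tfrac{1}{2}a$. Then

$W_{1-\frac{1}{2}a,\ \frac{1}{2}a-\frac{1}{2}+n}(x) = e^{-x/2} x^{1-\frac{1}{2}a}\, {}_2F_0(-n,\ a + n - 1;\ -1/x).$

Hence

(13) $y_n(x, a, b) = e^{b/2x}\,(b/x)^{\frac{1}{2}a-1}\, W_{1-\frac{1}{2}a,\ \frac{1}{2}a-\frac{1}{2}+n}(b/x).$

From this definition and from the known properties of Whittaker's function,
which has been widely studied, one can deduce all the important properties of
$y_n(x, a, b)$ together with many new ones. This definition has the advantage
that it avoids the limiting process given in §2.

It easily follows from (13) and the integral (8, §16.12)

$W_{k,m}(x) = \frac{e^{-x/2} x^{k}}{\Gamma(\tfrac{1}{2} - k + m)} \int_0^{\infty} t^{-k-\frac{1}{2}+m}(1 + t/x)^{k-\frac{1}{2}+m} e^{-t}\, dt \quad (R(k - \tfrac{1}{2} - m) \le 0),$

that

(14) $y_n(x, a, b) = \frac{1}{\Gamma(a + n - 1)} \int_0^{\infty} t^{a-2+n}(1 + tx/b)^{n} e^{-t}\, dt,$

valid for $R(a + n - 1) > 0$.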
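
A short worked case makes the integral representation concrete. Reading (14) as $y_n(x,a,b) = \Gamma(a+n-1)^{-1}\int_0^\infty t^{a-2+n}(1+tx/b)^n e^{-t}\,dt$, which is the form obtained by inserting $k = 1 - \tfrac12 a$, $m = \tfrac12 a - \tfrac12 + n$ into the Whittaker integral, the case $n = 1$ can be evaluated with the Gamma integral alone:

\[
\frac{1}{\Gamma(a)}\int_0^{\infty} t^{a-1}\Bigl(1+\frac{tx}{b}\Bigr)e^{-t}\,dt
= \frac{\Gamma(a) + (x/b)\,\Gamma(a+1)}{\Gamma(a)}
= 1 + \frac{a x}{b},
\]

which is $y_1(x,a,b)$ as given by the series (1). For general $n$ the same computation, after expanding $(1+tx/b)^n$ by the binomial theorem, reproduces the series term by term.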


In a similar manner we can find other new properties for $y_n(x, a, b)$.
A TWO-POINT BOUNDARY PROBLEM FOR ORDINARY
SELF-ADJOINT DIFFERENTIAL EQUATIONS
OF FOURTH ORDER


H. M. AND R. L. STERNBERG


1. Introduction. The purpose of this note is to establish Theorem A below
for the two-point homogeneous vector boundary problem

(1.1) $[P_0(x)u'']'' - [P_1(x)u']' + P_2(x)u = 0,$
(1.2) $u(x_1) = u'(x_1) = 0 = u'(x_2) = u(x_2),$

where the $P_i(x)$ are given real $m \times m$ symmetric matrix functions of $x$ with
$P_0(x)$ positive definite and $P_i(x)$ of class $C^{2-i}$ on an infinite interval $[a, \infty)$,
and where by a solution of (1.1)-(1.2) for $a \le x_1 < x_2 < \infty$ we understand
a real $m$-dimensional column vector $u = u(x)$ of class $C^2$ on $[a, \infty)$ which is
such that $P_i(x)u^{(2-i)}$ is of class $C^{2-i}$ on $[a, \infty)$ and which satisfies (1.1)-(1.2)
with the former a vector identity on $[a, \infty)$.


THEOREM A. If for some real number $k_0 > 0$ each of the matrices $P_0(x) - k_0 I$,
$P_1(x)$ and $P_2(x)$ is negative semi-definite on $[a, \infty)$ and if there exists an $a_0 \ge a$
such that for arbitrary $x_1$, $x_2$ satisfying $a_0 \le x_1 < x_2 < \infty$ the only solution
$u = u(x)$ of (1.1)-(1.2) is the trivial solution $u(x) \equiv 0$ on $[a, \infty)$ then the
improper matrix integrals

(1.3) $\int^{\infty} P_1(x)\, dx, \quad \int^{\infty} x^2 P_2(x)\, dx,$

exist and for each $m$-dimensional constant vector $\pi$ satisfying $\pi^*\pi = 1$ we have

(1.4) $\limsup_{x \to \infty} \left| x\, \pi^* \int_x^{\infty} P_1(t)\, dt\, \pi \right| \le 2k_0, \quad \limsup_{x \to \infty} \left| x\, \pi^* \int_x^{\infty} t^2 P_2(t)\, dt\, \pi \right| \le 8k_0.$


We note in passing that Theorem A can be considered to be an analogue for 
the present problem of non-oscillation theorems of Hille (1) and Wintner (4) 
for an ordinary linear differential equation of the second order; see in particular 
(1, §1). This analogy follows at once from the process employed in the proof 
below in which (1.1) is transformed into a vector differential system of the 
second order and twice the dimensionality for which the identical vanishing 
of the only solution of (1.1) — (1.2) corresponds to non-oscillation of the 
transformed system in the sense of Sternberg (3, §2). Theorem A is also a par- 
tial converse of a theorem essentially given by Kaufman and Sternberg 


Received September 10, 1953. Presented to the American Mathematical Society under a 
(2, §§1 and 3) for a higher order homogeneous two-point boundary problem
of which (1.1)-(1.2) is a special case.


2. Proof. We may assume without loss of generality that $a_0 > 0$. It has
been noted in effect in (2, §§2 and 3) that a solution $u = u(x)$ of (1.1) defines
by the relations $y_1 = u$, $y_2 = u'$, $\mu = P_1(x)u' - [P_0(x)u'']'$ a solution
$y = y(x) = (y_j(x))$, $(j = 1, 2)$, $\mu = \mu(x)$ of the system of $2m + m$ ordinary
linear differential equations


Lynd =[(¢ A sj +(‘)uI 
G1) 7 ie rot) a (_°)u]=0. 
&[y] = (10) ny + (0- n(*) = 0, 


in the sense of (3, §2) and that this system is a special case of the system (2.1)
of (3, §2), the elements $y_1 = y_1(x)$, $y_2 = y_2(x)$ and $\mu = \mu(x)$ being
$m$-dimensional column vectors. We now observe conversely that a solution $y = y(x)$,
$\mu = \mu(x)$ of (2.1) in the sense of (3, §2) defines by the relation $u = y_1$ a
solution $u = u(x)$ of (1.1) in the sense stated earlier; moreover, from the form of
(1.2) and the equation $\Phi[y] = 0$ in (2.1) it follows that if for $a_0 \le x_1 < x_2 < \infty$
the only solution $u = u(x)$ of (1.1)-(1.2) is the trivial solution $u(x) \equiv 0$ then
the system (2.1) is non-oscillatory on $[a_0, \infty)$ in the sense of (3, §2). Hence,
under the hypotheses of Theorem A we have by Lemma 3.2 of (3, §3) that

(2.2) $J[\eta; a_0, b_0] = \int_{a_0}^{b_0} [\eta^{*\prime} G(x)\eta' - \eta^* F(x)\eta]\, dx > 0$

for all $b_0 > a_0$ and arbitrary admissible variations $\eta = \eta(x) = (\eta_j(x)) \not\equiv 0$,
$(j = 1, 2)$ on $[a_0, b_0]$ satisfying $\eta(a_0) = 0 = \eta(b_0)$ and $\Psi[\eta] \equiv \Psi(x)\eta' = 0$
where similarly as in (2, §2)


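$$G(x) = \begin{pmatrix} 0 & 0 \\ 0 & P_0(x) \end{pmatrix}, \qquad F(x) = \begin{pmatrix} -P_2(x) & -xP_2(x) \\ -xP_2(x) & -P_1(x) - x^2 P_2(x) \end{pmatrix}, \tag{2.3}$$

$$\Psi(x) = (I \quad xI),$$
and by definition in (3, §3) $\eta_1(x)$ and $\eta_2(x)$ are m-dimensional column vectors
of class D', the equation $\Psi[\eta] = 0$ being required to hold merely in a piecewise
manner on $[a_0, b_0]$.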

Next, employing the symmetry and negative semi-definiteness of the
matrices $P_1(x)$ and $P_2(x)$ on $[a_0, \infty)$ to establish the existence of real $m \times m$
matrices $Q_j(x)$ such that $P_j(x) = -Q_j^{*}(x)\, Q_j(x)$ on $[a_0, \infty)$ and, consequently,
also such that

$$F(x) = K^{*}(x) K(x), \qquad K(x) = \begin{pmatrix} Q_2(x) & xQ_2(x) \\ 0 & Q_1(x) \end{pmatrix} \quad \text{on } [a_0, \infty), \tag{2.4}$$

we note that the matrix $F(x)$ is positive semi-definite on $[a_0, \infty)$.
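
The factorization (2.4) can be checked by multiplying out the blocks. The sketch below assumes the block forms $F = \begin{pmatrix} -P_2 & -xP_2 \\ -xP_2 & -P_1 - x^2P_2 \end{pmatrix}$ and $K = \begin{pmatrix} Q_2 & xQ_2 \\ 0 & Q_1 \end{pmatrix}$; these are readings inferred from the surrounding relations, since the printed displays are only partly legible.

```latex
% Assumption: P_j = -Q_j^* Q_j (j = 1, 2), with F and K of the block forms named above.
\begin{aligned}
K^{*}K
 &= \begin{pmatrix} Q_2^{*} & 0 \\ xQ_2^{*} & Q_1^{*} \end{pmatrix}
    \begin{pmatrix} Q_2 & xQ_2 \\ 0 & Q_1 \end{pmatrix}
  = \begin{pmatrix} Q_2^{*}Q_2 & xQ_2^{*}Q_2 \\ xQ_2^{*}Q_2 & x^{2}Q_2^{*}Q_2 + Q_1^{*}Q_1 \end{pmatrix} \\
 &= \begin{pmatrix} -P_2 & -xP_2 \\ -xP_2 & -x^{2}P_2 - P_1 \end{pmatrix} = F , \\
\eta^{*}F\eta &= (K\eta)^{*}(K\eta) = |K\eta|^{2} \ge 0
  \quad \text{for every vector } \eta .
\end{aligned}
```

The last line is the positive semi-definiteness used later in the proof.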
Now consider for $a_0 < x_1 < x_2 < x_3 < x_4 < b_0$ the admissible variation
$\eta = \eta(x) = (\eta_j(x))$, $(j = 1, 2)$ defined on $[a_0, b_0]$ for each m-dimensional
constant vector $\pi$ satisfying $\pi^{*}\pi = 1$ as











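$$\eta_1(x) = \begin{cases}
-\dfrac{x^2 - a_0^2}{2(x_1 - a_0)}\,\pi, & \text{on } [a_0, x_1], \\[1ex]
-\tfrac12 (x_1 + a_0)\,\pi, & \text{on } [x_1, x_2], \\[1ex]
-\tfrac12 (x_1 + a_0)\,\pi + \dfrac{x^2 - x_2^2}{x_3 - x_2}\,\pi, & \text{on } [x_2, x_3], \\[1ex]
-\tfrac12 (x_1 + a_0)\,\pi + (x_3 + x_2)\,\pi, & \text{on } [x_3, x_4], \\[1ex]
-\tfrac12 (x_1 + a_0)\,\pi + (x_3 + x_2)\,\pi - \dfrac{x^2 - x_4^2}{2(b_0 - x_4)}\,\pi, & \text{on } [x_4, b_0],
\end{cases}$$

$$\eta_2(x) = \begin{cases}
\dfrac{x - a_0}{x_1 - a_0}\,\pi, & \text{on } [a_0, x_1], \\[1ex]
\pi, & \text{on } [x_1, x_2], \\[1ex]
\left(1 - 2\,\dfrac{x - x_2}{x_3 - x_2}\right)\pi, & \text{on } [x_2, x_3], \\[1ex]
-\pi, & \text{on } [x_3, x_4], \\[1ex]
\dfrac{x - b_0}{b_0 - x_4}\,\pi, & \text{on } [x_4, b_0],
\end{cases} \tag{2.5}$$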

with $x_4 = x_3 + (x_2 - x_1)$ and $b_0 = x_4 + (x_1 - a_0)$ and where $x_1$ and $x_2$ are
not to be confused with the $x_1$ and $x_2$ in the statement of the theorem. One
readily verifies that for each $\pi$ as described the admissible variation $\eta = \eta(x)$
given by (2.5) satisfies the conditions set forth below (2.2). Hence, employing
the positive semi-definiteness of the matrix $F(x)$ on $[a_0, \infty)$ to obtain from (2.2)
the relation


$$0 \le \int_{x_1}^{x_2} \eta^{*} F(x)\, \eta \, dx \le \int_{a_0}^{b_0} \eta^{*} F(x)\, \eta \, dx < \int_{a_0}^{b_0} \eta'^{*} G(x)\, \eta' \, dx \tag{2.6}$$


we substitute (2.3) and (2.5) in (2.6) and use the negative semi-definiteness
of the matrix $P_0(x) - k_0 I$ on $[a_0, \infty)$ to establish the chain of relations

$$\begin{aligned}
0 &\le -\int_{x_1}^{x_2} \pi^{*} P_1(x)\, \pi \, dx - \tfrac14 \int_{x_1}^{x_2} (2x - x_1 - a_0)^2\, \pi^{*} P_2(x)\, \pi \, dx \\
&< \int_{a_0}^{x_1} \frac{\pi^{*} P_0(x)\, \pi}{(x_1 - a_0)^2}\, dx + 4 \int_{x_2}^{x_3} \frac{\pi^{*} P_0(x)\, \pi}{(x_3 - x_2)^2}\, dx + \int_{x_4}^{b_0} \frac{\pi^{*} P_0(x)\, \pi}{(b_0 - x_4)^2}\, dx \\
&\le \frac{k_0}{x_1 - a_0} + \frac{4k_0}{x_3 - x_2} + \frac{k_0}{b_0 - x_4} = \frac{2k_0}{x_1 - a_0} + \frac{4k_0}{x_3 - x_2}
\end{aligned} \tag{2.7}$$

for each m-dimensional constant vector $\pi$ satisfying $\pi^{*}\pi = 1$. Since $x_1$, $x_2$ and
$x_3$ are arbitrary numbers satisfying $a_0 < x_1 < x_2 < x_3$ it is clear that we may
make $x_2$, $x_3$ and $x_3 - x_2$ as large as we please. Hence, employing the symmetry
and negative semi-definiteness of the matrices $P_1(x)$ and $P_2(x)$ on $[a_0, \infty)$ once
more it follows from (2.7) that each of the improper matrix integrals


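$$\int_{x_1}^{\infty} P_1(x)\, dx \quad \text{and} \quad \int_{x_1}^{\infty} (2x - x_1 - a_0)^2 P_2(x)\, dx \tag{2.8}$$
exists and that

$$\begin{aligned}
0 &\le -(x_1 - a_0)\, \pi^{*} \int_{x_1}^{\infty} P_1(x)\, dx \; \pi \le 2k_0, \\
0 &\le -(x_1 - a_0)\, \pi^{*} \int_{x_1}^{\infty} (2x - x_1 - a_0)^2 P_2(x)\, dx \; \pi \le 8k_0,
\end{aligned} \tag{2.9}$$

for each $\pi$ satisfying $\pi^{*}\pi = 1$ as before. The conclusions of Theorem A now
follow readily.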
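
The verification that the variation (2.5) is admissible, left to the reader above, can be sketched as follows. The sketch assumes that the side condition is $\Psi[\eta] \equiv \eta_1' + x\eta_2' = 0$ (that is, $\Psi(x) = (I \;\; xI)$) and that $\eta_2$ is the continuous piecewise linear function with values $0, \pi, \pi, -\pi, -\pi, 0$ at $a_0, x_1, x_2, x_3, x_4, b_0$. Both are readings of partly illegible displays, not quotations.

```latex
% Assumption: \eta_1' = -x\,\eta_2' on each subinterval, \eta_1(a_0) = 0, \pi^*\pi = 1.
\begin{aligned}
&[a_0,x_1]: && \eta_2' = \tfrac{\pi}{x_1-a_0}, &&
   \eta_1 = -\tfrac{x^2-a_0^2}{2(x_1-a_0)}\,\pi, &&
   \eta_1(x_1) = -\tfrac12(x_1+a_0)\,\pi, \\
&[x_1,x_2]: && \eta_2' = 0, && \eta_1 = -\tfrac12(x_1+a_0)\,\pi, \\
&[x_2,x_3]: && \eta_2' = -\tfrac{2\pi}{x_3-x_2}, &&
   \eta_1 = -\tfrac12(x_1+a_0)\,\pi + \tfrac{x^2-x_2^2}{x_3-x_2}\,\pi, &&
   \eta_1(x_3) = -\tfrac12(x_1+a_0)\,\pi + (x_3+x_2)\,\pi, \\
&[x_3,x_4]: && \eta_2' = 0, && \eta_1 = \eta_1(x_3), \\
&[x_4,b_0]: && \eta_2' = \tfrac{\pi}{b_0-x_4}, &&
   \eta_1 = \eta_1(x_3) - \tfrac{x^2-x_4^2}{2(b_0-x_4)}\,\pi .
\end{aligned}

% Endpoint condition: with x_4 = x_3 + (x_2 - x_1) and b_0 = x_4 + (x_1 - a_0),
%   b_0 + x_4 = 2x_3 + 2x_2 - x_1 - a_0, hence
\eta_1(b_0) = \Bigl[-\tfrac12(x_1+a_0) + (x_3+x_2) - \tfrac12(b_0+x_4)\Bigr]\pi = 0 ,
\qquad \eta_2(b_0) = 0 .

% Integrand of (2.6) on [x_1,x_2]: \eta_1 + x\eta_2 = \tfrac12(2x - x_1 - a_0)\,\pi, so
\eta^{*}F\eta = -\pi^{*}P_1\pi - \tfrac14(2x-x_1-a_0)^2\,\pi^{*}P_2\,\pi .
```

The particular choice of $x_4$ and $b_0$ is exactly what makes $\eta_1$ return to zero at $b_0$, and the last line is the integrand appearing in the first member of (2.7).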
ON THE CONTINUOUS SPECTRA
OF SINGULAR BOUNDARY VALUE PROBLEMS 


C. R. PUTNAM 


1. Introduction. Suppose that $p(t) > 0$, that both $p(t)$ and $f(t)$ are
continuous functions on the half-line $0 \le t < \infty$, and that $\lambda$ denotes a real
parameter. Only real-valued functions will be considered in this paper. Let the
differential equation

$$L(x) + \lambda x = 0, \quad \text{where } L(x) = (px')' - fx, \tag{1}$$

be of the limit-point type (3, p. 238), so that (1) and a linear homogeneous
boundary condition

$$x(0) \cos\alpha + x'(0)\, p(0) \sin\alpha = 0, \qquad 0 \le \alpha < \pi, \tag{$2_\alpha$}$$

determine a boundary value problem on $0 \le t < \infty$ for every fixed $\alpha$. Let
$\rho_\alpha(\lambda)$ denote the unique continuous monotone basis function on $-\infty < \lambda < \infty$,
normalized by $\rho_\alpha(0) = 0$, determining the eigendifferentials associated with the
continuous spectrum, $C_\alpha$ (3, pp. 238-251).

It is known that the set $S'$ consisting of the set of cluster points of the
spectrum, $S_\alpha$, is independent of $\alpha$ (3, p. 251). Furthermore, in the standard
examples of equations (1), the set $C_\alpha$ is independent of $\alpha$; this is so if, for
example, $f(t)$ is periodic (4). The question was raised by Weyl (3, p. 252) as to
whether the continuous spectrum is invariant under change of the boundary condition
$(2_\alpha)$, that is, as to whether the set $C_\alpha$ is always independent of $\alpha$. Although this
question will remain unanswered in this paper, except under a special assumption,
it still seems to be of interest to compare the various existing basis
functions $\rho_\alpha(\lambda)$, belonging to different values $\alpha$. Except in explicit, special cases
(cf., e.g., 3, p. 264; 2, p. 59), very little seems to be known in this connection.
A contribution to some knowledge in this direction is contained in the following:


THEOREM (*). Let $p(t) > 0$ and $f(t)$ be continuous on $0 \le t < \infty$ and suppose
that (1) is of the limit-point type. Suppose that there exist a fixed interval $\Delta$ and
two distinct boundary conditions $(2_{\alpha_1})$ and $(2_{\alpha_2})$, $\alpha_1 \ne \alpha_2$, such that $\Delta$ is in each
of the sets $C_{\alpha_1}$ and $C_{\alpha_2}$ and such that the basis function $\rho_{\alpha_1}(\lambda)$ is an absolutely
continuous function of $\rho_{\alpha_2}(\lambda)$ on the interval $\Delta$. Then

(i) the interval $\Delta$ is in the continuous spectrum $C_\alpha$ for every boundary condition
$(2_\alpha)$, $0 \le \alpha < \pi$; and

(ii) the basis function $\rho_{\alpha_1}(\lambda)$ is an absolutely continuous function of every basis
function $\rho_\alpha(\lambda)$ on the interval $\Delta$ $(0 \le \alpha < \pi)$.


Henceforth, for simplicity in notation, let $\rho_k(\lambda) = \rho_{\alpha_k}(\lambda)$ for $k = 1, 2$. It
follows from (*) that, for any basis function $\rho_1(\lambda)$ which is strictly increasing
on an interval $\Delta$, there are only two possibilities: on the fixed interval $\Delta$, either
$\rho_1(\lambda)$ is an absolutely continuous function of every basis function $\rho_\alpha(\lambda)$ (indeed,
the interval $\Delta$ is in the continuous spectrum of every boundary value problem
(1), $(2_\alpha)$ in the case (i)) or $\rho_1(\lambda)$ is not an absolutely continuous function of any
(other, existing) basis function $\rho_\alpha(\lambda)$ for which $\Delta$ is in $C_\alpha$.

Received June 10, 1953.


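2. Proof of (i) of (*). Let $\phi_k(t, \lambda) = \phi(t, \lambda, \alpha_k)$, for $k = 1$ and 2, be solutions
of (1) satisfying

$$\phi_k(0, \lambda) = -\sin\alpha_k, \qquad p(0)\, \phi_k'(0, \lambda) = \cos\alpha_k. \tag{3}$$
Then the eigendifferentials are given by

$$d_\lambda \Phi_k(t, \lambda) = \phi_k(t, \lambda)\, d\rho_k(\lambda), \tag{4}$$

where

$$\int_0^\infty \Bigl(\int_\delta d_\lambda \Phi_k\Bigr)^2 dt = \int_\delta d\rho_k$$

for arbitrary $\delta$ (see 3, p. 249). Let $\alpha \ne \alpha_1, \alpha_2$ and let $\phi(t, \lambda) = \phi(t, \lambda, \alpha)$ be the
solution of (1) satisfying
$$\phi(0, \lambda) = -\sin\alpha, \qquad p(0)\, \phi'(0, \lambda) = \cos\alpha. \tag{5}$$

It will be shown that the interval $\Delta$ of theorem (*) is in the set $C_\alpha$. To this end,
suppose, if possible, the contrary. Then there exists a subinterval of $\Delta$, say $\delta$,
such that $\delta$ has no points in common with the (closed) set $C_\alpha$. Clearly, there
exist continuous functions $A_1(\lambda)$ and $A_2(\lambda)$ such that

$$\phi(t, \lambda) = A_1(\lambda)\, \phi_1(t, \lambda) + A_2(\lambda)\, \phi_2(t, \lambda). \tag{6}$$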

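Since $\rho_1(\lambda)$ is an absolutely continuous function of $\rho_2(\lambda)$ on $\Delta$, it follows
(Radon-Nikodym) that there is a function $B = B(\lambda)$ such that

$$d\rho_1(\lambda) = B(\lambda)\, d\rho_2(\lambda) \quad \text{on } \Delta. \tag{7}$$
It is clear from (7) that
$$\int_\delta B^{-1}\, d\rho_1 < \infty. \tag{8}$$

(It is understood, of course, that if $B$ is zero for some values $\lambda$, then, in the
integrations with respect to $\rho_1$, the set $\delta$ can be replaced by a set $\delta'$ such that
$B > 0$ on $\delta'$ and $\int_{\delta'} d\rho_1 = \int_\delta d\rho_1$.) Next, define the function $M(t)$ by

$$M(t) = \int_\delta B^{-1/2}(\lambda)\, \phi(t, \lambda)\, d\rho_1(\lambda), \tag{9}$$
so that, by (6) and (7),
$$M(t) = \int_\delta A_1 B^{-1/2} \phi_1\, d\rho_1 + \int_\delta A_2 B^{1/2} \phi_2\, d\rho_2. \tag{10}$$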
In view of (7) and (8), the inequality $(a + b)^2 \le 2(a^2 + b^2)$, and the properties
of the eigendifferentials (4), one has
$$\int_0^\infty M^2\, dt \le 2 \int_\delta (A_1^2 B^{-1} + A_2^2)\, d\rho_1 < \infty, \tag{11}$$

so that $M(t)$ is of class $L^2[0, \infty)$. Moreover, $M$ is differentiable and

$$M(0) = -(\sin\alpha) \int_\delta B^{-1/2}\, d\rho_1, \qquad p(0)\, M'(0) = (\cos\alpha) \int_\delta B^{-1/2}\, d\rho_1. \tag{12}$$

Since each of the integrals of (12) is clearly different from zero, $M(t) \not\equiv 0$.
It will be shown that $M(t)$ is orthogonal to all eigenfunctions and
eigendifferentials belonging to the boundary value problem determined by (1) and
$(2_\alpha)$, and a contradiction will thus be obtained.
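
The estimate (11) can be traced step by step. The sketch below assumes the readings $M(t) = \int_\delta B^{-1/2}\phi\, d\rho_1$ for (9) and $\phi = A_1\phi_1 + A_2\phi_2$ for (6), and it uses the property quoted after (4) in the extended form $\int_0^\infty (\int_\delta h\, d_\lambda\Phi_k)^2 dt \le \int_\delta h^2\, d\rho_k$ for a weight $h(\lambda)$; that extension is an assumption of this sketch and is not stated explicitly in the text.

```latex
% Assumptions: (6), (7), (9) as read above; isometry-type bound for eigendifferentials.
\begin{aligned}
M(t) &= \int_\delta A_1 B^{-1/2}\, d_\lambda\Phi_1 + \int_\delta A_2 B^{1/2}\, d_\lambda\Phi_2
   && \bigl(B^{-1/2}\, d\rho_1 = B^{1/2}\, d\rho_2 \text{ by (7)}\bigr), \\
\int_0^\infty M^2\, dt
  &\le 2\int_0^\infty \Bigl(\int_\delta A_1 B^{-1/2}\, d_\lambda\Phi_1\Bigr)^{2} dt
     + 2\int_0^\infty \Bigl(\int_\delta A_2 B^{1/2}\, d_\lambda\Phi_2\Bigr)^{2} dt
   && \bigl((a+b)^2 \le 2(a^2+b^2)\bigr), \\
  &\le 2\int_\delta A_1^{2} B^{-1}\, d\rho_1 + 2\int_\delta A_2^{2} B\, d\rho_2
   = 2\int_\delta \bigl(A_1^{2} B^{-1} + A_2^{2}\bigr)\, d\rho_1 .
\end{aligned}
```

The right side is finite because $A_1$, $A_2$ are continuous on the closed interval $\delta$ and $\int_\delta B^{-1} d\rho_1 < \infty$ by (8).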

Let $\mu$ denote an eigenvalue on $\delta = [\lambda_1, \lambda_2]$ of the boundary value problem
(1) and $(2_\alpha)$. It will be supposed that $\lambda_1 < \mu < \lambda_2$; the treatment in case $\mu$ is
an end-point will be clear. Let $\delta_n$ denote the set of values $\lambda$: $[\lambda_1, \mu - 1/n]$
$+ [\mu + 1/n, \lambda_2]$ ($n$ large), and define $M_n(t)$ by

$$M_n(t) = \int_{\delta_n} B^{-1/2} \phi\, d\rho_1. \tag{13}$$
It will first be shown that
$$\int_0^\infty M_n(t)\, \xi(t)\, dt = 0, \tag{14}$$

where $\xi(t)$ denotes an eigenfunction belonging to $\mu$.
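The functions $\phi$ and $\xi$ satisfy the equations

$$L(\phi) + \lambda\phi = 0, \qquad L(\xi) + \mu\xi = 0 \tag{15}$$
(cf. (1)), and hence for every $T > 0$,

$$\int_0^T [\xi L(\phi) - \phi L(\xi)]\, dt = (\mu - \lambda) \int_0^T \phi\, \xi\, dt. \tag{16}$$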

Moreover, an integration by parts shows that, for any two functions $x$, $y$
possessing continuous second derivatives on $0 \le t < \infty$,

$$\int_0^T (x L(y) - y L(x))\, dt = p\,(x y' - x' y)\, \Big|_0^T \tag{17}$$

(3, p. 223). An application of Fubini's theorem for the interchange of the order
of integration shows that

$$\int_0^T M_n\, \xi\, dt = \int_{\delta_n} \Bigl(\int_0^T \phi\, \xi\, dt\Bigr) B^{-1/2}\, d\rho_1, \tag{18}$$

where $M_n$ is defined by (13). Relation (18) implies, as a consequence of (16),
(17), and the fact that $M_n$ and $\xi$ satisfy the boundary condition $(2_\alpha)$, that

$$\int_0^T M_n\, \xi\, dt = \int_{\delta_n} p(T)\, [\phi'(T, \lambda)\, \xi(T) - \phi(T, \lambda)\, \xi'(T)]\, (\mu - \lambda)^{-1} B^{-1/2}(\lambda)\, d\rho_1(\lambda). \tag{19}$$
If $A = A(t)$ is defined by

$$A(t) = \int_{\delta_n} \phi(t, \lambda)\, (\mu - \lambda)^{-1} B^{-1/2}(\lambda)\, d\rho_1(\lambda), \tag{20}$$

it is seen that

$$\int_0^\infty A^2\, dt \le 2 \int_{\delta_n} (A_1^2 B^{-1} + A_2^2)\, (\mu - \lambda)^{-2}\, d\rho_1 < \infty \tag{21}$$

and that

$$\int_0^\infty (L(A))^2\, dt \le 2 \int_{\delta_n} (A_1^2 B^{-1} + A_2^2)\, \lambda^2 (\mu - \lambda)^{-2}\, d\rho_1 < \infty \tag{22}$$

(3, p. 249, and relations (7), (8), and (11) above). Relation (19) can be
expressed as

$$\int_0^T M_n\, \xi\, dt = p(T)\, (A'(T)\, \xi(T) - A(T)\, \xi'(T)). \tag{23}$$

It follows from (21) and (22) and the fact that $\xi$ and $L(\xi)$ also belong to class
$L^2[0, \infty)$ that the expression on the right side of (23) tends to zero as $T \to \infty$
(3, pp. 241-242). Consequently, relation (14) now follows.
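
The passage from (16) and (17) to (19) can be spelled out as follows. This is a sketch under the readings of (16), (17) and (19) adopted here; the only ingredient not displayed in the text is the remark that two functions satisfying the same boundary condition $(2_\alpha)$ have vanishing Wronskian-type expression at $t = 0$.

```latex
% Take x = \xi and y = \phi(\cdot,\lambda) in (17) and combine with (16):
(\mu-\lambda)\int_0^T \phi\,\xi\,dt
  = \int_0^T \bigl[\xi L(\phi) - \phi L(\xi)\bigr]\,dt
  = p\,(\xi\phi' - \xi'\phi)\Big|_{0}^{T} .

% At t = 0 both (\xi(0), \xi'(0)) and (\phi(0), \phi'(0)) are annihilated by the
% non-zero vector (\cos\alpha,\; p(0)\sin\alpha), so they are linearly dependent and
\xi(0)\,\phi'(0,\lambda) - \xi'(0)\,\phi(0,\lambda) = 0 .

% On \delta_n one has |\mu - \lambda| \ge 1/n, so division by \mu - \lambda is allowed:
\int_0^T \phi\,\xi\,dt
  = p(T)\,\bigl[\phi'(T,\lambda)\,\xi(T) - \phi(T,\lambda)\,\xi'(T)\bigr]\,(\mu-\lambda)^{-1} .
```

Multiplying by $B^{-1/2}(\lambda)$, integrating over $\delta_n$ with respect to $\rho_1$ and using (18) gives (19).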

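Next, it will be shown that

$$\int_0^\infty M(t)\, \xi(t)\, dt = 0. \tag{24}$$

In view of (14), it is sufficient to show that

$$\int_0^\infty (M(t) - M_n(t))^2\, dt \to 0, \quad \text{as } n \to \infty. \tag{25}$$
However,
$$M - M_n = \int_{\mu - 1/n}^{\mu + 1/n} \phi(t, \lambda)\, B^{-1/2}(\lambda)\, d\rho_1(\lambda)$$

and hence

$$\int_0^\infty (M - M_n)^2\, dt \le 2 \int_{\mu - 1/n}^{\mu + 1/n} (A_1^2 B^{-1} + A_2^2)\, d\rho_1. \tag{26}$$

The right side of (26) tends to zero when $n \to \infty$ and relation (24) now follows.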

It remains to be shown that $M(t)$ is orthogonal to all of the eigendifferentials
of the boundary value problem (1), $(2_\alpha)$. To this end, it is convenient to assume
that $\delta = [-\lambda_1, \lambda_1]$, where $\lambda_1 > 0$. (That this may be assumed without loss of
generality is clear from the fact that the continuous spectrum is merely
translated by a constant $\gamma$ if $f$ is replaced by $f + \gamma$.) If the set $C_\alpha$ is not empty,
then the eigendifferentials are given by

$$N = N(t, J) = \int_J \phi(t, \lambda)\, d\rho(\lambda), \qquad \rho(\lambda) = \rho_\alpha(\lambda), \tag{27}$$

where $J$ is an arbitrary (say, closed) $\lambda$-interval. Since the closed interval $\delta$
contains no points in common with the set $C_\alpha$, it is sufficient to show that
$$\int_0^\infty M(t)\, N(t, J)\, dt = 0 \tag{28}$$

for all closed intervals $J$ having no point in common with $\delta$.

Consider then an interval $J = [\mu_1, \mu_2]$, where $\lambda_1 < \mu_1$. (The case in which
$\mu_2 < -\lambda_1$ can be treated similarly and will not be considered separately.)
Suppose then that $\lambda$ is in $\delta$ and that $\mu$ is in $J$. It follows from the equations

$$L(\phi(t, \lambda)) + \lambda\phi(t, \lambda) = 0, \qquad L(\phi(t, \mu)) + \mu\phi(t, \mu) = 0 \tag{29}$$
and the relations (16) and (17) that
$$\int_0^T \phi(t, \lambda)\, \phi(t, \mu)\, dt = p(T)\, [\phi'(T, \lambda)\, \phi(T, \mu) - \phi(T, \lambda)\, \phi'(T, \mu)]\, (\mu - \lambda)^{-1}. \tag{30}$$

It follows readily from (30) that

$$\int_0^T M(t)\, N(t, J)\, dt = \sum_{n=0}^{\infty} p(T)\, [A_n'(T)\, B_n(T) - A_n(T)\, B_n'(T)], \tag{31}$$

where $M$, $N$, $A_n$, and $B_n$ are defined by (9), (27), and

$$A_n(t) = \int_\delta \lambda^n B^{-1/2}(\lambda)\, \phi(t, \lambda)\, d\rho_1(\lambda), \qquad B_n(t) = \int_J \phi(t, \mu)\, \mu^{-n-1}\, d\rho(\mu). \tag{32}$$

(The interchanges of the order of integration together with the interchange
of the summation and integration are readily seen to be justified.) Relation
(17) implies

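$$p(T)\, [A_n'(T)\, B_n(T) - A_n(T)\, B_n'(T)] = \int_0^T [B_n L(A_n) - A_n L(B_n)]\, dt. \tag{33}$$
By the Schwarz inequality,

$$\left| \int_0^T B_n L(A_n)\, dt \right| \le \Bigl(\int_0^\infty B_n^2\, dt\Bigr)^{1/2} \Bigl(\int_0^\infty (L(A_n))^2\, dt\Bigr)^{1/2}. \tag{34}$$

From (32) and the fact that $J = [\mu_1, \mu_2]$,

$$\int_0^\infty B_n^2\, dt \le \int_J \mu^{-2n-2}\, d\rho(\mu) \le \mu_1^{-2n-2} \int_J d\rho(\mu) < \infty. \tag{35}$$

Furthermore, relations (6), (7), and (10) imply that

$$L(A_n) = -\int_\delta \lambda^{n+1} A_1 B^{-1/2} \phi_1\, d\rho_1 - \int_\delta \lambda^{n+1} A_2 B^{1/2} \phi_2\, d\rho_2. \tag{36}$$

Hence (cf. (4)),
$$\int_0^\infty (L(A_n))^2\, dt \le 2 \int_\delta \lambda^{2n+2} (A_1^2 B^{-1} + A_2^2)\, d\rho_1 \le 2 \lambda_1^{2n+2} \int_\delta (A_1^2 B^{-1} + A_2^2)\, d\rho_1 < \infty. \tag{37}$$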
In particular, the functions $B_n$ and $L(A_n)$ are of class $L^2[0, \infty)$, while a similar
analysis shows that $A_n$ and $L(B_n)$ are also in class $L^2[0, \infty)$. Consequently,
each term of the summation of (31) satisfies

$$p(T)\, [A_n'(T)\, B_n(T) - A_n(T)\, B_n'(T)] \to 0, \quad \text{as } T \to \infty \tag{38}$$

(3, pp. 241-242).
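
The series in (31) comes from expanding $(\mu - \lambda)^{-1}$ in (30) as a geometric series. The sketch below assumes $\delta = [-\lambda_1, \lambda_1]$, $J = [\mu_1, \mu_2]$ with $\lambda_1 < \mu_1$, and the readings of (30) and (32) used here; it is an illustration of the step the text calls "readily" verified, not a quotation.

```latex
% For \lambda \in \delta and \mu \in J one has |\lambda/\mu| \le \lambda_1/\mu_1 < 1, hence
\frac{1}{\mu-\lambda} = \frac{1}{\mu}\cdot\frac{1}{1-\lambda/\mu}
  = \sum_{n=0}^{\infty} \lambda^{n}\,\mu^{-n-1}
  \qquad \text{(uniformly convergent on } \delta \times J\text{)} .

% Insert this in (30), multiply by B^{-1/2}(\lambda)\,d\rho_1(\lambda)\,d\rho(\mu),
% and integrate over \delta \times J:
\int_0^T M(t)\,N(t,J)\,dt
  = \sum_{n=0}^{\infty} p(T)\Bigl[
      \Bigl(\int_\delta \lambda^{n} B^{-1/2}\phi'(T,\lambda)\,d\rho_1\Bigr)
      \Bigl(\int_J \phi(T,\mu)\,\mu^{-n-1}\,d\rho\Bigr)
    - \Bigl(\int_\delta \lambda^{n} B^{-1/2}\phi(T,\lambda)\,d\rho_1\Bigr)
      \Bigl(\int_J \phi'(T,\mu)\,\mu^{-n-1}\,d\rho\Bigr)\Bigr] .

% The four inner integrals are A_n'(T), B_n(T), A_n(T), B_n'(T) of (32).
```

The same ratio $\lambda_1 \mu_1^{-1} < 1$ supplies the geometric decay in $n$ that the bounds (35) and (37) need for the uniform estimates.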
It now follows from (35), (37), and the inequality $\lambda_1 \mu_1^{-1} < 1$, that

$$\sum_{n=N}^{\infty} \left| \int_0^T B_n L(A_n)\, dt \right| \to 0, \quad \text{as } N \to \infty,$$

holds uniformly in $T$ $(0 < T < \infty)$. Similarly,

$$\sum_{n=N}^{\infty} \left| \int_0^T A_n L(B_n)\, dt \right| \to 0, \quad \text{as } N \to \infty,$$

holds uniformly in $T$ $(0 < T < \infty)$. Hence, the series on the right side of the
equation (31) tends to zero as $T \to \infty$ and so (28) follows. Thus the function
$M(t)$ of (9) is orthogonal to all eigenfunctions and eigendifferentials of the
boundary value problem determined by (1) and $(2_\alpha)$ and, as remarked earlier
in this section, a contradiction is obtained. This completes the proof of part (i)
of (*).





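3. Proof of (ii) of (*). Let $\phi$ be defined as in §2, and let $d\Phi$ denote the
eigendifferentials, so that

$$d_\lambda \Phi(t, \lambda) = \phi(t, \lambda)\, d\rho(\lambda). \tag{39}$$
Let $M(t) = M_\delta(t)$ be defined by (9) where, now, $\delta$ is any interval contained in
$\Delta$. Then, by (3, pp. 250-251), the function $M_\delta(t)$ has an expansion

$$M_\delta(t) = \sum_k c_k\, \xi_k(t) + \int \phi(t, \lambda)\, d\Gamma(\lambda), \tag{40}$$

where the $\xi_k$ denote the eigenfunctions of the boundary value problem (1),
$(2_\alpha)$ and the $c_k$ and $d\Gamma(\lambda)$ are given by

$$c_k = \int_0^\infty M_\delta(t)\, \xi_k(t)\, dt, \qquad \int_{\delta'} d\Gamma = \int_0^\infty M_\delta(t) \Bigl(\int_{\delta'} d_\lambda \Phi\Bigr) dt \quad (\delta' \text{ arbitrary}). \tag{41}$$
In view of the uniqueness properties associated with the expansion (40),
however, it follows from (40) and (9) that $c_k = 0$ and
$$B^{-1/2}(\lambda)\, d\rho_1(\lambda) = d\Gamma(\lambda) \tag{42}$$

holds on the interval $\delta$. Thus, provided $\delta'$ is contained in $\delta$, relation (42), the
second relation of (41), and the Schwarz inequality imply

$$\int_{\delta'} B^{-1/2}\, d\rho_1 \le \Bigl(\int_0^\infty M_\delta^2\, dt\Bigr)^{1/2} \Bigl(\int_0^\infty \Bigl(\int_{\delta'} d_\lambda \Phi\Bigr)^2 dt\Bigr)^{1/2}. \tag{43}$$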

Henceforth, it will be convenient to put $\delta' = \delta$. From the properties of the
eigendifferentials (39),
$$\int_0^\infty \Bigl(\int_\delta d_\lambda \Phi\Bigr)^2 dt = \int_\delta d\rho. \tag{44}$$

It follows from (43), (44), (11), and the Schwarz inequality that

$$\Bigl(\sum \int_\delta B^{-1/2}\, d\rho_1\Bigr)^2 \le 2 \Bigl(\sum \int_\delta (A_1^2 B^{-1} + A_2^2)\, d\rho_1\Bigr) \Bigl(\sum \int_\delta d\rho\Bigr), \tag{45}$$

where the summations are taken over any sequence of intervals $\delta$ contained in
$\Delta$. Let $Z$ denote any subset of the interval $\Delta$ for which

$$\int_Z d\rho(\lambda) = 0.$$

It follows readily from (45) and (8) that

$$\Bigl(\int_Z B^{-1/2}(\lambda)\, d\rho_1(\lambda)\Bigr)^2 \le \text{const.} \int_Z d\rho(\lambda) \; (= 0);$$
hence,
$$\int_Z d\rho_1 = \int_Z B^{1/2} B^{-1/2}\, d\rho_1 = 0.$$

Thus the variation of $\rho_1(\lambda)$ is zero over any set $Z$ over which the variation of
$\rho(\lambda)$ is zero. Hence $\rho_1(\lambda)$ is an absolutely continuous function of $\rho(\lambda)$ (that is,
by the Radon-Nikodym theorem, there is a function $C(\lambda)$ such that
$d\rho_1(\lambda) = C(\lambda)\, d\rho(\lambda)$) and the proof of (ii) of (*) is now complete.
A TENSOR BOUNDARY VALUE PROBLEM
OF MIXED TYPE 


G. F. D. DUFF 


The boundary value problems of generalized potential theory on finite 
Riemannian manifolds may be regarded as extensions of the Dirichlet and 
Neumann problems for harmonic functions. In the tensor theory there is, in 
fact, a greater variety of such problems; that is to say, these generalizations 
from classical potential theory can be made in various ways. We here intro- 
duce yet another pair of boundary value problems for the tensor equation of 
Laplace. 

Two boundary value problems for harmonic p-tensors, which, for p = 0 or
p = N, reduce to the classical Dirichlet and Neumann problems, were
discussed in (1.b) by means of the Poincaré-Fredholm integral equation
technique. In one, values of components are assigned on the boundary, while in the
other, values of components of the derivative and co-derivative are specified. 
As in the scalar problems, these are related, inasmuch as the system of integral 
equations appropriate to the one, when transposed, leads to the other. The 
whole formulation is invariant: that is, only tensor quantities and operators 
defined invariantly on the boundary surface appear in the statements and 
proofs. 

In this paper we shall discuss a second invariant generalization of the
Dirichlet and Neumann problems. This type of problem is mixed, in the sense
that values both of components and of their first derivatives are assigned at
each boundary point. Although there are at first sight two problems of this
kind, again related by transposition, the second mixed problem for p-tensors
is equivalent to the first mixed problem for (N−p)-tensors. The eigentensors
of these problems are harmonic fields whose tangential or normal components 
vanish on the boundary, while the dimension of the eigenspace is a relative 
or absolute Betti number of the manifold. By specializing the boundary values, 
we obtain theorems for closed or co-closed harmonic forms, and also for har- 
monic fields, thus bringing these hitherto separate theories together with that 
of the tensor Laplace equation. 


1. Formulation of the problem. We consider orientable Riemannian manifolds
of dimension N and differentiability class $C^\infty$. M will denote a compact
manifold with boundary B of dimension N − 1 and class $C^\infty$, while F will
denote a closed and compact manifold which is the double of M. A positive
definite metric tensor $g_{ik}$ of class $C^\infty$ is supposed given on M, and can be
extended to F.

On F we consider skew-symmetric covariant tensors
$$\phi_{i_1 \ldots i_p}$$
with which are associated exterior differential forms $\phi$ of degree p. We have the
differential operator $d$, the dual operator $*$, the co-differential

$$\delta = (-1)^{Np+N+1} * d *$$

and the Laplacian $\Delta = \delta d + d\delta$. Precise definitions of these are given in (1.b)
or (5) to either of which we refer for brevity. The scalar product
$(\phi, \psi) = \int_F \phi \wedge *\psi$ defines a Hilbert space of p-forms on F (or on M, if the integration
is extended over M) since $N(\phi) = (\phi, \phi)$ is positive unless $\phi = 0$.

As in (1.c) we shall make use of double skew-symmetric p-tensor fields

$$A_{(i_1 \ldots i_p)(j_1 \ldots j_p)}$$

which are symmetric in the two groups of indices $i_1 \ldots i_p$ and $j_1 \ldots j_p$.

We shall approach the study of the Laplace equation for harmonic forms
$$\Delta\phi = 0 \tag{1.1}$$
via an equation of type
$$L\phi \equiv \Delta\phi + A\phi = 0, \tag{1.2}$$

where $A\phi$ is the differential form corresponding to the tensor

$$A_{(i_1 \ldots i_p)(j_1 \ldots j_p)}\, \phi^{(j_1 \ldots j_p)},$$

and where the matrix of independent components of A will be taken as positive
definite. Green's formulae for (1.2) are

$$(d\phi, d\psi) + (\delta\phi, \delta\psi) + (\phi, A\psi) - (\phi, L\psi) = \int (\phi \wedge *d\psi - \delta\psi \wedge *\phi), \tag{1.3}$$

and
$$(\psi, L\phi) - (\phi, L\psi) = \int (\phi \wedge *d\psi - \delta\psi \wedge *\phi - \psi \wedge *d\phi + \delta\phi \wedge *\psi), \tag{1.4}$$

where the integrals on the right are taken over the boundary surface of the
domain of integration indicated by the round brackets. From (1.3) we see that
the Dirichlet integral for (1.2) is

$$E(\phi, \psi) = (d\phi, d\psi) + (\delta\phi, \delta\psi) + (\phi, A\psi) = E(\psi, \phi). \tag{1.5}$$

When A is positive definite, $E(\phi, \phi) > 0$ unless $\phi = 0$. The formulae for
Laplace's equation are found by setting A = 0 in the above. Then $E(\phi, \phi) = 0$
implies only that $d\phi = 0$ and $\delta\phi = 0$.
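
Both Green's formulae can be obtained from the shorter form of Green's formula that is used later in the paper. The sketch below assumes that shorter form in the version $(d\chi, \psi) = (\chi, \delta\psi) + \int_B \chi \wedge *\psi$ for a $(p-1)$-form $\chi$ and a $p$-form $\psi$, together with the symmetry $(\phi, A\psi) = (\psi, A\phi)$; the sign conventions are an assumption read off from the way the formula is applied in §1 and §5, not taken from (1.b).

```latex
% Short Green formula (assumed form):  (d\chi,\psi) = (\chi,\delta\psi) + \int_B \chi\wedge *\psi .
\begin{aligned}
(d\phi, d\psi) &= (\phi, \delta d\psi) + \int_B \phi\wedge *d\psi
   && (\chi = \phi,\ \text{second slot } d\psi), \\
(\delta\phi, \delta\psi) &= (\phi, d\delta\psi) - \int_B \delta\psi\wedge *\phi
   && (\chi = \delta\psi,\ \text{second slot } \phi).
\end{aligned}

% Adding, and using \Delta = \delta d + d\delta and L = \Delta + A:
(d\phi,d\psi) + (\delta\phi,\delta\psi) + (\phi,A\psi) - (\phi,L\psi)
   = \int_B (\phi\wedge *d\psi - \delta\psi\wedge *\phi) .            % (1.3)

% Writing (1.3) as E(\phi,\psi) - (\phi,L\psi) = \ldots, interchanging \phi and \psi,
% and subtracting (E is symmetric):
(\psi,L\phi) - (\phi,L\psi)
   = \int_B (\phi\wedge *d\psi - \delta\psi\wedge *\phi
             - \psi\wedge *d\phi + \delta\phi\wedge *\psi) .          % (1.4)
```

The four boundary terms of (1.4) pair $t\phi$ with $t{*}d\psi$ and $t\delta\psi$ with $t{*}\phi$ (and symmetrically), which is why prescribing $t\phi$, $t\delta\phi$ or $n\phi$, $nd\phi$ makes the right side vanish.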

In (1.c) it was shown that, for A positive definite, the equation (1.2) has a
fundamental singularity in the large $g_A(x, y)$ in any compact closed space F
satisfying our conditions. As in the paper referred to, we use this singularity
for the construction of single and double layer potentials.

The boundary operators $t$ and $n$, satisfying $*t = n*$, $*n = t*$, are defined as
in (1.b). Thus $t\phi$ is the induced p-form on the boundary B.

From (1.b) we recall that in the first (or Dirichlet) boundary value problem
for harmonic forms, the quantities $t\phi$ and $n\phi$ are assigned on B, and that the
eigenspace O for this problem is finite-dimensional, depending on the
topological structure of the manifold, or possibly zero. In the second or Neumann
problem we assign $nd\phi$ and $t\delta\phi$, subject to a certain orthogonality condition.
The associated eigenspace F is the space of harmonic fields ($d\phi = 0$, $\delta\phi = 0$)
on M, and is known to be infinite-dimensional if $1 \le p \le N - 1$. The
corresponding problems for (1.2) with A positive definite have unique solutions.

The two boundary value problems which we now introduce shall be known as
the K and M problems. In the K problem we assign $t\phi$ and $t\delta\phi$; in the M
problem, $n\phi$ and $nd\phi$. It is easily verified that the number of conditions so
prescribed is, in each case, equal to the number $\binom{N}{p}$ of independent
components of $\phi$. These boundary conditions are also self-adjoint, in the sense that
if $\phi$ and $\psi$ both satisfy one of these homogeneous conditions, the right hand
side of (1.4) vanishes.

The mixed boundary condition of Robin’s type which was discussed in 
(1.b, §6) can be made to yield the Dirichlet (D), Neumann (N) or the K and 
M problems as limiting cases. For example, to obtain formally the K boundary 
condition, from (6.4) of (1.b), let A, — © and Ay_, — 0. It is clear that these 
limiting cases must all be treated separately since the hypotheses of (1.b, §6) 
are not then satisfied. 

The eigensolutions of these problems may be characterized with the help of
(1.2). We see that $L\phi = 0$, $t\phi = 0$, $t\delta\phi = 0$ imply $E(\phi, \phi) = 0$, the integral
being taken over M. Thus $\phi = 0$ in M, whenever A is positive definite, and if
A is zero identically, we still have $d\phi = 0$, $\delta\phi = 0$ but not necessarily $\phi = 0$.
Similarly, if $L\phi = 0$, $n\phi = 0$, $nd\phi = 0$, we have $E(\phi, \phi) = 0$ with $\phi = 0$ if A
is positive definite, and $d\phi = 0$, $\delta\phi = 0$ if A = 0.

Thus the independent conditions satisfied by a solution of the homogeneous
K-problem are $d\phi = 0$, $\delta\phi = 0$ and $t\phi = 0$. In (2) it was shown that if the
relative periods of $\phi$ on $R_p(M, B)$ independent relative p-cycles are assigned,
$\phi$ is uniquely determined. Indeed, if these relative periods are given to be zero,
then, according to (1.a), $\phi$ is the derivative $d\chi$ of a (p − 1)-form $\chi$ whose
tangential part vanishes on B. Thus, by the shorter form of Green's formula,

$$N(\phi) = (\phi, d\chi) = (\delta\phi, \chi) + \int_B \chi \wedge *\phi = 0,$$

so that $\phi$ vanishes identically. Therefore the dimension of the eigenspace K
of the K problem is $R_p(M, B) = R_{N-p}(M)$, by the Lefschetz duality theorem.

Reasoning exactly dual to this shows that the eigenspace M of the M
problem has dimension $R_p(M) = R_{N-p}(M, B)$. Indeed, if $\phi$ satisfies a boundary
condition of the K type, its dual $*\phi$ is an N − p form satisfying a corresponding
boundary condition of the M type.

We remark that the intersection of the eigenspace K and the eigenspace M
is the eigenspace O of the Dirichlet problem in which both $t\phi$ and $n\phi$ vanish.

The letters K and M will also be used to denote projection operators (in the
$L^2$ norm) on the K and M eigenspaces. In this connection we have the operator
relations $dK = 0$, $\delta K = 0$, $dM = 0$, $\delta M = 0$. If $\rho \in K$, and $\psi$ is arbitrary, then, since

$$(\rho, \delta\psi) = (d\rho, \psi) - \int_B \rho \wedge *\psi = 0$$
we have $K\delta = 0$. Similarly $Md = 0$. In fact, this last follows also from the
evident formulae $*M = K*$, $*K = M*$.


2. Potentials. Let $g = g_A(x, y)$ denote the fundamental solution of
$\Delta\phi + A\phi = 0$ (A positive definite) in the double F. Based on this double form
we have the potentials

$$\mu = \int_B (\rho \wedge *dg - \delta\rho \wedge *g) = \int_B (\rho \wedge *dg + *g \wedge *d*\rho) \tag{2.1}$$
and
$$\nu = \int_B (g \wedge *d\sigma - \delta g \wedge *\sigma) = \int_B (g \wedge *d\sigma + *\sigma \wedge *d*g), \tag{2.2}$$

each of which contains both single and double layer terms. The layer densities
$t\rho$, $t\delta\rho$, $nd\sigma$ and $n\sigma$ are assumed here to be Hölder continuous on B. We shall
calculate the discontinuities of these quantities as the argument point crosses
B from M to the complementary part CM in F, and for this purpose we use
the formulae of §3 of (1.b), noting that the singularity of the de Rham kernel
$g(x, y)$ is asymptotically equal to that of $g_A(x, y)$.

From (3.5) of (1.b) we see that the first term of $t\mu$ increases by $t\rho$: the second
term of $t\mu$ is clearly continuous. From the same formulae we see that the first
term of $t{*}\mu$ is continuous, and so also is the second. To calculate the
discontinuity of $t{*}d\mu$, we have

$$t{*}d\mu = t{*}d \int_B (\rho \wedge *dg - \delta\rho \wedge *g)$$

and from (3.6) of (1.b) we see that the second term is continuous across B. If
we take the dual of (4.12) of (1.b) we see that the first term is also continuous.
Lastly, we must examine $t{*}d{*}\mu$. According to (3.5) of (1.b) the second term
of $t{*}d{*}\mu$ decreases by $t{*}d{*}\rho$, while from (3.6) of the same paper, the first term
is continuous. Analogous results for the dual potential $\nu$ may be found by taking
the duals of the above results with p replaced by N − p; or directly. Collecting
together the results so found, we see that $t\mu$, $t{*}d{*}\mu$, $t{*}\nu$, and $t{*}d\nu$ have the
respective discontinuities $t\rho$, $-t{*}d{*}\rho$, $-t{*}\sigma$, and $t{*}d\sigma$; while $t{*}\mu$, $t{*}d\mu$, $t\nu$, and
$t{*}d{*}\nu$ are continuous across B.
We conclude that on B, we have as limits from M,

$$\begin{aligned}
t_-\mu &= \tfrac12 t\rho + t \int_B (\rho \wedge *dg + *g \wedge *d*\rho), \\
t_-{*}d{*}\mu &= -\tfrac12 t{*}d{*}\rho + t{*}d{*} \int_B (\rho \wedge *dg + *g \wedge *d*\rho), \\
t_-{*}d\nu &= -\tfrac12 t{*}d\sigma + t{*}d \int_B (g \wedge *d\sigma + *\sigma \wedge *d*g), \\
t_-{*}\nu &= \tfrac12 t{*}\sigma + t{*} \int_B (g \wedge *d\sigma + *\sigma \wedge *d*g).
\end{aligned} \tag{2.3}$$
Here the integrals on the right are understood to be evaluated on the boundary.
The singularity of $g_A(x, y)$ for x = y is such that principal values of these
integrals must be taken. The − sign appended to the operator $t$ on the left of
each equation indicates a limiting value from M; a + sign will be used to
indicate limits from CM.

The reasoning of the Poincaré-Fredholm method now shows that the
solution of the K problem with assigned data $t\phi$, $t\delta\phi$ is to be sought by solving the
system of singular integral equations

$$t_-\mu = t\phi, \qquad t_-{*}d{*}\mu = t{*}d{*}\phi. \tag{2.4}$$
Similarly, the solution of the M problem will be found by solving the system
$$t_-{*}d\nu = t{*}d\phi, \qquad t_-{*}\nu = t{*}\phi \tag{2.5}$$

with given data $t{*}\phi$, $t{*}d\phi$.
The kernel of the system (2.4) is 


a ty *, dy ga(x, ¥), t, ty #y ga(x, 9) 
t, ty *z d, *, Fy d, ga(x, y), ty ty *2 d, *y 2a (x, y/; 


while the transpose of this kernel, namely 


(‘ ty *, d, ga (x, 9), te ty *, d, *, d, *y £a(x, “ 
t, ly ¥, ga(x, y), t, ty *, *, d, *y ga(x, ¥) ’ 


is the kernel of the system (2.5). Thus the analogy with the case considered in 
(1.b) is complete. 


3. Solution of the integral equations. The condition for the compatibility 
of (2.3) or (2.4) is, that the non-homogeneous terms should be orthogonal, over 
the domain of integration B, to every solution of the homogeneous transposed 
equations (3). In each case the homogeneous transposed equation arises when 
we try to solve the boundary value problem of the dual type for the domain 
CM. 

For the K problem we will show that any H-continuous solution of the
homogeneous transposed equation

$$\begin{aligned}
0 &= \tfrac12 t{*}d\sigma + t{*}d \int_B (g \wedge *d\sigma + *\sigma \wedge *d*g), \\
0 &= -\tfrac12 t{*}\sigma + t{*} \int_B (g \wedge *d\sigma + *\sigma \wedge *d*g),
\end{aligned} \tag{3.1}$$

is identically zero. The potential $\nu$ of (2.2) corresponding to any such solution
$\sigma$ satisfies on B,

$$t_+{*}d\nu = 0, \qquad t_+{*}\nu = 0, \tag{3.2}$$

and also is a solution of $\Delta\nu + A\nu = 0$ in CM. Thus the Dirichlet integral over
CM of $\nu$ is zero, and hence $\nu$ vanishes identically in CM. Passing through the
boundary B to M, we see that
$$t_-\nu = 0, \qquad t_-{*}\nu = -t{*}\sigma, \qquad t_-{*}d\nu = t{*}d\sigma, \qquad t_-{*}d{*}\nu = 0. \tag{3.3}$$

From the first and last of these relations, it follows that $\nu$ has a zero Dirichlet
integral in M, and so vanishes identically. Thus finally $t{*}\sigma = 0$ and $t{*}d\sigma = 0$.

That is, according to (3), the equations (2.4) possess a solution for arbitrary
continuous data. Reasoning along the lines of this proof, we could easily show
that this solution of the integral equations is also unique. The proof of existence
of a solution of (2.5) parallels that just given and so will be omitted.


THEOREM I. If A is positive definite in M, the differential equation $\Delta\phi + A\phi = 0$
has unique solutions with either $t\phi$, $t\delta\phi$ or $n\phi$, $nd\phi$ having assigned H-continuous
boundary values.


In the usual way we can now assert the existence of Green's forms
corresponding to the K and M problems on M. These are obtained by subtracting
from the fundamental singularity $g_A(x, y)$ solutions of the differential equation,
the appropriate boundary values of which agree with those of $g_A(x, y)$. For the
K problem we find a domain functional $K_p(x, y)$ which satisfies

$$\Delta_x K + AK = 0 \ (x \ne y); \qquad t_x K = 0, \quad t_x \delta_x K = 0; \qquad K(x, y) \sim g(x, y) \ (x \sim y). \tag{3.4}$$
The symmetry property $K(x, y) = K(y, x)$ is easily established. For the M
problem we construct $M_p(x, y)$ satisfying
$$\Delta_x M + AM = 0 \ (x \ne y); \qquad n_x M = 0, \quad n_x d_x M = 0; \qquad M(x, y) \sim g(x, y) \ (x \sim y). \tag{3.5}$$
Finally, we see that $*_x *_y K_p(x, y) = M_{N-p}(x, y)$.


4. Laplace's Equation. The above method requires modification when
applied to Laplace's equation, for which the matrix A is zero. We shall
construct a modified Green's function in F, which will appear in the kernel of the
integral equations corresponding to (2.4). For this purpose let $A_0$ be a
sufficiently differentiable matrix tensor which is positive definite in CM and vanishes
in M (1.c). Then to the differential equation

$$\Delta\phi + A_0\phi = 0 \tag{4.1}$$
there corresponds the Dirichlet integral
$$D(\phi, \phi) = N(d\phi) + N(\delta\phi) + (\phi, A_0\phi)_{CM}. \tag{4.2}$$

Any form which is harmonic in M and vanishes in CM annuls this integral and
is an eigenform of (4.1) in F. That is, the eigenspace of (4.1) is $O = M \cap K$
as defined in § 1.

The method of (5) for constructing a Green’s form applies to (4.1) as in 
(1, c), and it is a straightforward matter to verify in the-present case the 
existence of a Green’s form go(x, y) for (4.1) which satisfies the following 
equation: 





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(4.3) A; go(x, ¥) = —ao(x, y), x ty, 
where 

ao = 2) w4(x) w4(y) 
is the reproducing kernel in M of the orthonormalized forms w,(x) of the eigen- 


space O. Thus ao(x, y) is a harmonic field with vanishing boundary values on 
B. Hence also 


dé dgy = (Q, 5d 5go = 0. 


This kernel is symmetric as usual. With go(x, y) we may construct surface 
layers of the types (2.1) and (2.4), and for the remainder of this section jo 
and v» will denote these modified potentials. 

If p is an eigenform of the K problem, that is, if dp = 0, 5p = 0, tg = 0, and 
if A@ = 0 in M, then from Green’s formula we have 


(6, 44) + Dio, 6) = f (oA xdd — 56 A #0). 
The left-hand side vanishes, as does the first term on the right. We then find 
(4.4) f 36 =0 
B 


as a necessary condition for the solution of the K problem by harmonic forms. 
Writing 
(4.5) fo = fo A *dgo — dp A *go) 
B 
we see that 


du = f ip A vay = 0 
B 


since fag = 0. That is, a surface layer (4.5) is a harmonic form. The solution of 
the K problem will then be attained if there exists a surface layer (4.5) satis- 
fying the boundary conditions. 

The integral equations of the problem again take the form 


(4.6) tuo = tp, t_dpuo = tad. 


A solution (tp, tip) exists if and only if the nonhomogeneous terms (t¢, té@) are 
orthogonal to every solution of the homogeneous transposed equations. As in 
§3, to such an eigensolution (t#dc, tec) corresponds a potential vo, satisfying 
(4.1), with 

(4.7) tiadvo = 0, tyevo = 0 on B. 


Since Ao is positive definite in M, we find 
(4.8) » =0 in M. 
The discontinuity conditions then yield 


(4.9) typ = 0, t_evo = —teac, t_adyy = taeda, t_edevy = 0. 
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Since go(x, y) satisfies (4.3), we find 
(4.10) Avo = Do crt, 
where the ¢, form a basis for the eigenspace O. Then in M, we see from (4.9), 
Yo = f (go A «do — 6g0 A *o) = J (go A *dvo ~ 6g0 A\ *¥9). 
B B 
Therefore 
(4.11) Avy = — f (ao A *dvo + dao A *V9) = 0, 
B 
since da = 0 and tay = 0. Thus » is a harmonic form in M. Next we calculate 
Dy (vo) => (dvo, dvo) a (dro, bvo) 


= (vo, Avo) + | (vo A *dvo — bvo A *¥9) = O 
B 


since Avo = 0 and t_vy) = 0, t_*d#vy = 0. Therefore vo is in fact a harmonic 
field in M, and since t_vp = 0, vo is a member of the eigenspace K. 

The orthogonality condition sufficient for the existence of a solution of the 
integral equations (4.6) now becomes 


f(A ado — 56.0 40) = — f 59. +0 
B B 


J b¢ A *Vo, 
B 


in view of the second of (4.9). That is, the necessary condition

$$\int_B \delta\phi \wedge *v_0 = 0, \qquad v_0 \in K,$$

is sufficient for the existence of a solution of the integral equations. The final
result then follows from our remark that $\mu_0$ is a harmonic form.


THEOREM II. There exists a solution $\phi$ of $\Delta\phi = 0$, $t\phi = t\xi$, $t\delta\phi = t\delta\xi$, if and
only if

$$\int_B \delta\xi \wedge *\rho = 0 \tag{4.12}$$

for every harmonic field $\rho$ for which $t\rho = 0$ on $B$.


In particular, if $R_p(M, B) = 0$, the condition (4.12) is satisfied.


5. Co-closed harmonic forms. The K problem is solvable if the given
values for $t\delta\phi$ are all zero. We will show that in this case the solution $\phi$ is
co-closed throughout $M$. Since $\delta\Delta\phi = \delta d\delta\phi = 0$, we have

$$N(d\delta\phi) = (d\delta\phi, d\delta\phi) = (\delta\phi, \delta d\delta\phi) + \int_B \delta\phi \wedge *d\delta\phi = 0,$$

by Green's formula, and since $t\delta\phi$ vanishes. Thus $d\delta\phi = 0$ identically, and so
also $\delta d\phi = 0$. Now

$$N(\delta\phi) = (\delta\phi, \delta\phi) = (\phi, d\delta\phi) - \int_B \delta\phi \wedge *\phi = 0,$$

since $d\delta\phi = 0$ and $t\delta\phi = 0$. Thus $\delta\phi = 0$ as stated.


THEOREM III. There exists a co-closed harmonic form having a given tangential 
boundary value. 


This result is similar to but not identical with Theorem 2 of (2) which states
the existence of a solution to the problem $\delta d\phi = 0$, $t\phi = t\xi$ given on $B$, and that
if $\xi$ is of the form $\delta\chi$, then there exists a unique solution $\phi$ of the form $\delta\psi$. The
theorem just established is stronger than the first of these two statements, and
more general than the second since the values of $t\phi$ are restricted only by
continuity.

If we further restrict $t\phi$ to be equal to the values of a derived form $d\alpha$
defined on $B$, we can show that $d\phi$ is zero in $M$. The restriction on $t\phi$ will be
satisfied if $d_B t\phi = 0$, and if $t\phi$ has zero periods on all $p$-cycles of $B$ (1.a).
We see, in fact, that

$$N(d\phi) = (d\phi, d\phi) = (\phi, \delta d\phi) + \int_B \phi \wedge *d\phi.$$

The volume term disappears since $\delta d\phi = 0$. Since $t\phi = t\,d\alpha$, we have

$$\begin{aligned}
d_B t(\alpha \wedge *d\phi) &= t\,d(\alpha \wedge *d\phi) \\
&= t(d\alpha \wedge *d\phi) + (-1)^{p-1}\, t(\alpha \wedge d{*}d\phi) \\
&= t(\phi \wedge *d\phi),
\end{aligned}$$

again since $d{*}d\phi = \pm {*}\delta d\phi = 0$. Thus, by Stokes' theorem,

$$\int_B \phi \wedge *d\phi = \int_{\partial B} \alpha \wedge *d\phi = 0$$

since the boundary $\partial B$ of $B$ is zero. Finally, $d\phi = 0$ in $M$. This result is a weaker
form of the Dirichlet theorem for harmonic fields (2, Theorem 3).

Another type of condition which ensures that the orthogonality condition
(4.12) be satisfied is that the assigned values of $t\delta\phi$ on $B$ should be equal to a
derived form $d_B\chi_{p-2}$ defined on $B$. If $\rho$ is an eigenform, we have $d{*}\rho = 0$, and

$$\begin{aligned}
d_B t(\chi \wedge *\rho) &= d_B(\chi \wedge t{*}\rho) \\
&= d_B\chi \wedge t{*}\rho + (-1)^p\, \chi \wedge d_B t{*}\rho \\
&= t\delta\phi \wedge t{*}\rho + (-1)^p\, \chi \wedge t\,d{*}\rho \\
&= t(\delta\phi \wedge *\rho).
\end{aligned}$$

Hence, by Stokes' theorem,

$$\int_B \delta\phi \wedge *\rho = \int_{\partial B} \chi \wedge *\rho = 0,$$

since $B$ is closed. That is, (4.12) is satisfied as stated.

When the assigned boundary values of $t\delta\phi$ are of this type, and when also the
values of $t\phi$ vanish, the corresponding solution of the K problem is closed. In
fact, we have


THEOREM IV. There exists a closed harmonic form $\phi = \phi_p$, with $t\phi = 0$,
$t\delta\phi = d_B\chi_{p-2}$, where $\chi_{p-2}$ is a $p - 2$ form of class HC' defined on $B$.


The orthogonality condition being satisfied, there exists a harmonic form
$\phi$ with $t\phi = 0$, $t\delta\phi = d_B\chi$. To show that $\phi$ is closed, we first calculate

$$N(d\delta\phi) = (\delta\phi, \delta d\delta\phi) + \int_B \delta\phi \wedge *d\delta\phi.$$

The volume term vanishes since $\delta d\delta\phi = \delta\Delta\phi = 0$. For the surface integral, we
have

$$\begin{aligned}
d_B t(\chi \wedge *d\delta\phi) &= d_B(\chi \wedge t{*}d\delta\phi) \\
&= d_B\chi \wedge t{*}d\delta\phi + (-1)^p\, \chi \wedge t\,d{*}d\delta\phi \\
&= d_B\chi \wedge t{*}d\delta\phi \pm \chi \wedge t{*}\delta d\delta\phi \\
&= t\delta\phi \wedge t{*}d\delta\phi + 0 \\
&= t(\delta\phi \wedge *d\delta\phi).
\end{aligned}$$

The integrand being a derived form on $B$, the surface integral over the closed
boundary manifold $B$ vanishes, by Stokes' theorem. That is, $N(d\delta\phi) = 0$, so
$d\delta\phi = 0$. Again, since $\Delta\phi = 0$ we have also $\delta d\phi = 0$. Thus

$$N(d\phi) = (\phi, \delta d\phi) + \int_B \phi \wedge *d\phi = 0,$$

since $\delta d\phi = 0$ and $t\phi = 0$. Therefore, finally, $d\phi = 0$ and Theorem IV is proved.

This proof demonstrates that a harmonic form with $t\phi = 0$, $t\delta\phi = d_B\chi$ is
necessarily closed. However, the sufficient condition $t\phi = 0$ may be replaced
by the condition $n\,d\phi = 0$ which is clearly necessary, and the remark $d\phi = 0$
still holds. To show this we refer to the Neumann boundary value theorem
(1.b, Theorem II). It is a consequence of this result that there exists a harmonic
form $\phi$ with $n\,d\phi = 0$ and $t\delta\phi = d_B\chi$ if and only if

$$\int_B d_B\chi \wedge *\tau = 0$$

for every harmonic field $\tau$ defined throughout $M$. But

$$d_B t(\chi \wedge *\tau) = d_B\chi \wedge t{*}\tau \pm \chi \wedge t\,d{*}\tau = t(d_B\chi \wedge *\tau),$$

since $d{*}\tau$ is zero, and so this condition of orthogonality is satisfied. Therefore a
harmonic form $\phi$ satisfying the indicated boundary conditions exists. Now we
have $\Delta\phi = 0$, and therefore

$$t{*}d\delta\phi = -t{*}\delta d\phi = \pm t\,d{*}d\phi = \pm d_B t{*}d\phi = 0,$$

since $t{*}d\phi = {*}n\,d\phi$ is zero. Thus the surface integral in (5.2) vanishes; and,
since the volume integral is again zero, we conclude that $d\delta\phi$ vanishes. Then,
since also $\delta d\phi$ must be zero, we find that

$$N(d\phi) = (\phi, \delta d\phi) + \int_B \phi \wedge *d\phi = 0,$$

since $t{*}d\phi = {*}n\,d\phi$ is zero. That is, $d\phi = 0$ holds in this case as well.


6. Green's operators for Laplace's equation. From (4.10) we observe that
the equation $\Delta\phi = \beta$ is solvable in $M$ with the homogeneous boundary conditions
$t\phi = 0$, $t\delta\phi = 0$, if and only if $(\rho, \beta) = 0$ for all $\rho \in K$. Thus the modified
equation $\Delta\phi = \beta - K\beta$ is solvable for arbitrary $\beta$. Since the solution is
undetermined to the extent of an additive eigenform $\rho \in K$, uniqueness may be
secured by the additional requirement $K\phi = 0$. Writing $\phi = G_K\beta$, we define
the Green's operator $G_K$ for the K problem on the given domain. As in (1.b),
(5), we see that the kernel $g_K(x, y)$ of this operator is a double $p$-form with a
singularity asymptotic to the local fundamental singularity of $\Delta\phi = 0$, as
$x \to y$. Also $g_K(x, y)$ satisfies the differential equation

$$\Delta_y g_K(x, y) = -k(x, y), \qquad x \ne y, \tag{6.1}$$

where

$$k(x, y) = \sum_{i=1}^{R_p(M, B)} \rho_i(x)\,\rho_i(y) \tag{6.2}$$

is the reproducing kernel of the eigenspace $K$.

Since $tG_K\beta = 0$ and $t\delta G_K\beta = 0$, we find that $t_x g_K(x, y) = 0$ and
$t_x\delta_x g_K(x, y) = 0$. Also, to the orthogonality relation $KG_K = 0$ corresponds the formula
$(\rho(x), g_K(x, y)) = 0$ for each $\rho \in K$. The symmetry property $g_K(x, y) = g_K(y, x)$
now follows in the usual way from Green's formula. Thus $G_K$ is self-adjoint.

Let $\phi$ be any $p$-form; we calculate

$$\begin{aligned}
G_K\Delta\phi = (g_K, \Delta\phi) &= (\Delta g_K, \phi) + \int_B (\phi \wedge *dg_K - \delta g_K \wedge *\phi - g_K \wedge *d\phi + \delta\phi \wedge *g_K) \\
&= \phi - K\phi + \int_B (\phi \wedge *dg_K + \delta\phi \wedge *g_K).
\end{aligned} \tag{6.3}$$

Since $tG_K = 0$, $t\delta G_K = 0$, and $tK = 0$, $\delta K = 0$, we see that the surface integral

$$P_K\phi = -\int_B (\phi \wedge *dg_K + \delta\phi \wedge *g_K) \tag{6.4}$$

satisfies

$$tP_K\phi = t\phi, \qquad t\delta P_K\phi = t\delta\phi. \tag{6.5}$$

Moreover,

$$\Delta P_K\phi = \int_B \delta\phi \wedge *k,$$

and this surface integral is zero if and only if the orthogonality condition (5.1)
holds. That is, if $t\phi$, $t\delta\phi$ satisfy (5.1), then $P_K\phi$ is the harmonic form which
solves the K problem with these boundary values. From (6.2), (6.3) and (6.4)
we deduce the operator equation

$$\Delta G_K - G_K\Delta = P_K. \tag{6.6}$$

Similar formulae are valid for the dual M problem. There exists a unique
self-adjoint operator $G_M$ with kernel $g_M(x, y)$, satisfying relations corresponding
and dual to (6.2)-(6.10). Moreover,


#,%, ox (x,y) = gu? (x, 9). 


7. Examples. To find concrete examples of the existence theorems here
proved, we turn to Euclidean space of two or of three dimensions. In the plane,
a 1-form $\phi$, in Cartesian coordinates, is a differential

$$\phi = P\,dx + Q\,dy.$$

The dual is

$$*\phi = Q\,dx - P\,dy,$$

while

$$d\phi = (Q_x - P_y)\,dx\,dy, \qquad \delta\phi = P_x + Q_y.$$

Thus the main existence theorem may be stated in terms of the two coefficients
$P$ and $Q$ which will be harmonic functions if $\phi$ is harmonic. Let $R$ denote a
simply-connected region, bounded by a smooth curve $C$ of arc length parameter
$s$. We see that there exist unique harmonic functions $P$ and $Q$, such that
the quantities

$$P\frac{dx}{ds} + Q\frac{dy}{ds}, \qquad P_x + Q_y$$

take assigned Hölder continuous values on $C$. For in this case the K problem
has no eigenforms. Also, if $P_x + Q_y$ is given as zero on $C$, it is everywhere zero.
The dual problem may be stated by replacing $P$, $Q$ with $Q$, $-P$, respectively.
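
The two quantities in the plane example can be checked numerically. The following is a minimal sketch, not part of the original argument: it assumes the 1-form is given by two Python callables `P` and `Q`, uses central differences with a hypothetical step `h`, and takes as a sample field the gradient of the harmonic polynomial $x^3 - 3xy^2$, for which both $Q_x - P_y$ and $P_x + Q_y$ should vanish.

```python
def partial(f, x, y, axis, h=1e-5):
    """Central-difference estimate of the first partial derivative of f(x, y)."""
    if axis == "x":
        return (f(x + h, y) - f(x - h, y)) / (2.0 * h)
    return (f(x, y + h) - f(x, y - h)) / (2.0 * h)


def d_coefficient(P, Q, x, y):
    """Q_x - P_y, the coefficient of dx dy in d(phi) for phi = P dx + Q dy."""
    return partial(Q, x, y, "x") - partial(P, x, y, "y")


def divergence(P, Q, x, y):
    """P_x + Q_y, the second quantity assigned on the boundary curve C."""
    return partial(P, x, y, "x") + partial(Q, x, y, "y")


def tangential_component(P, Q, x, y, dx_ds, dy_ds):
    """P dx/ds + Q dy/ds for a unit tangent (dx/ds, dy/ds) to C."""
    return P(x, y) * dx_ds + Q(x, y) * dy_ds


# Hypothetical sample field: phi = du with u = x^3 - 3 x y^2 (a harmonic polynomial).
def P(x, y):
    return 3.0 * x**2 - 3.0 * y**2  # u_x


def Q(x, y):
    return -6.0 * x * y  # u_y


if __name__ == "__main__":
    x0, y0 = 0.3, -0.7
    print("Q_x - P_y =", d_coefficient(P, Q, x0, y0))
    print("P_x + Q_y =", divergence(P, Q, x0, y0))
```

Both printed values should be zero up to rounding, since this sample field is at once closed and co-closed.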

On the annulus $a < r < b$, $r^2 = x^2 + y^2$, the K problem has the eigenform
$\rho = dr$. Since $t{*}\rho = t{*}dr = r\,d\theta = ds$, the condition of solvability is

$$\int_{r=a} (P_x + Q_y)\,ds = \int_{r=b} (P_x + Q_y)\,ds.$$


In Euclidean three-space, a 1-form $\phi$ and the related quantities may be
written, in Cartesian coordinates,

$$\begin{aligned}
\phi &= P\,dx + Q\,dy + R\,dz, \\
*\phi &= P\,dy\,dz + Q\,dz\,dx + R\,dx\,dy, \\
d\phi &= (R_y - Q_z)\,dy\,dz + (P_z - R_x)\,dz\,dx + (Q_x - P_y)\,dx\,dy, \\
\delta\phi &= -(P_x + Q_y + R_z);
\end{aligned}$$

the differential corresponding to the curl and the co-differential being the
divergence, of the vector $(P, Q, R)$. Two mixed boundary value problems may
be stated for this vector, on a multiply-connected region $R$ of space, bounded
by a boundary surface $B$.

First, consider the K problem for the 1-form $\phi$ (or, equivalently, the M
problem for its dual $*\phi$). Here we assign two tangential components $T_1$, $T_2$ of $\phi$:

$$T_i = P\frac{\partial x}{\partial s^i} + Q\frac{\partial y}{\partial s^i} + R\frac{\partial z}{\partial s^i}, \qquad i = 1, 2,$$

where $s^1$ and $s^2$ are parameters of $B$; and the value of $\delta\phi$, or of

$$P_x + Q_y + R_z.$$

The number of independent eigenforms is equal to the number of independent
relative 1-cycles $R_1$; and in particular is zero if the boundary surface is connected
and if $R$ is simply-connected. In this case we may state that there
exist unique harmonic functions $P$, $Q$, $R$ satisfying the above boundary conditions.

Furthermore, if $(P_x + Q_y + R_z)$ is given as zero on $B$, then it is zero throughout
$R$; that is, the vector $(P, Q, R)$ is solenoidal. If also the condition $t\phi = t\,d\chi$
is satisfied as in §5, then $d\phi = 0$ which in this case implies that the vector
$(P, Q, R)$ is irrotational as well. The condition $t\phi = t\,d\chi$ takes here the form

$$T_1\,ds^1 + T_2\,ds^2 = dF(s^1, s^2),$$

which will hold if $\partial T_1/\partial s^2 = \partial T_2/\partial s^1$ on $B$, and if also for each absolute 1-cycle
$A^1$ of $B$ we have

$$\int_{A^1} t\phi = \int_{A^1} [T_1\,ds^1 + T_2\,ds^2] = 0.$$


Secondly, we see that the M problem for $\phi$ is equivalent to the K problem for
$*\phi$, and that in this problem there are assigned values of the normal component

$$N = P\frac{\partial x}{\partial n} + Q\frac{\partial y}{\partial n} + R\frac{\partial z}{\partial n}$$

($n$ denoting normal distance from $B$, locally), and the two tangential components
of the curl or differential $d\phi$. The eigenvectors of this problem are harmonic
vectors (irrotational and divergenceless) with vanishing normal components
on $B$. That is, they are the secondary flows or circulations of Kelvin (4). The
number of independent flows of this kind is equal to the number of absolute
1-cycles (irreducible circuits), or, equivalently, to the minimum number of
2-dimensional diaphragms needed to make the region simply-connected. Thus,
if $R$ is simply-connected, we may assert the existence of unique harmonic
functions $P$, $Q$, and $R$ which satisfy the boundary conditions. In the more
general case when $R$ is not simply connected, the orthogonality condition is
easily written down in terms of components. Again, we see from the dual of
Theorem III that if the two assigned components of the curl are zero on $B$, then
the curl vanishes inside.

We may here also apply the second result of §5 to the dual vector $*\phi$: thus
if $t{*}\phi = t\,d{*}\chi$ on $B$, we conclude that $d{*}\phi = 0$, i.e. that the divergence of
$(P, Q, R)$ vanishes. This necessary condition takes the form

$$N\,ds^1\,ds^2 = d(f_1\,ds^1 + f_2\,ds^2),$$

where $f_1$ and $f_2$ are single-valued on $B$. Since $B$ is two-dimensional, the two-form
$N\,ds^1\,ds^2$ is automatically closed; by de Rham's second theorem it is
derived if its periods are all zero; that is, if

$$\int_{B_i} N\,ds^1\,ds^2 = 0$$

for each component $B_i$ of $B$.

The preceding remarks may be summed up as follows. Suppose that each 
Cartesian component of a vector field (P,Q, R) is harmonic in a multiply- 
connected region R of space. Then if the vorticity vector on the boundary is 
everywhere normal to the boundary, the vector field is irrotational, while if 
there is zero net inflow over each boundary component, the vector field is 
solenoidal in R.

ON A DOUBLE INTEGRAL VARIATIONAL PROBLEM
P. R. GARABEDIAN AND M. SCHIFFER


1. Introduction. The results presented here were motivated by a desire to 
give a simple treatment of the non-linear elliptic partial differential equation 
governing the steady irrotational subsonic flow of an ideal compressible fluid. 
For two independent space variables $x$ and $y$, the stream function $\psi$ of such a
flow satisfies an equation

$$\frac{\partial}{\partial x}\left[F'(\psi_x^2 + \psi_y^2)\,\psi_x\right] + \frac{\partial}{\partial y}\left[F'(\psi_x^2 + \psi_y^2)\,\psi_y\right] = 0, \tag{1}$$

where $F = F(\psi_x^2 + \psi_y^2)$ is an analytic, increasing, convex function of
$q = (\psi_x^2 + \psi_y^2)^{1/2}$ whose explicit form depends on the equation of state of the
fluid in question. Our analysis of (1) will be based on the fact that it is the
Euler-Lagrange equation for the double integral variational problem

$$\iint F(\psi_x^2 + \psi_y^2)\,dx\,dy = \text{minimum}. \tag{2}$$

We shall introduce several devices for analyzing (2) which prove to be particularly
successful for the case $F = (1 + \psi_x^2 + \psi_y^2)^{1/2}$ of the Plateau problem.

Shiffman (5) has given a proof of the existence of subsonic compressible
flows based on (2). His work is part of an extensive literature on the calculus of
variations and on non-linear elliptic partial differential equations of which the
contributions by Haar (2) and Radó (4) come closest to the point of view of
our paper. The deduction of a priori estimates on the derivatives of the solution
$\psi$ of (2) and the discussion of the analyticity of $\psi$ are key developments in the
theory, and it is in these two directions that our analysis applies.

In the second section of the paper, we study the minimum problem (2) by
the method of interior variation (1). This leads in a natural way to an integral
$H$ for which we can derive a second order partial differential equation from the
existence of the first derivatives of $\psi$. In the case of the Plateau problem, $H$
satisfies an elliptic Monge-Ampère equation which ties in with Radó's proof
of the analyticity of a minimal surface.

As a preparation for the application of interior variations, we take up in the 
third section a construction based on symmetrization which yields for the 
Plateau problem a minimal sequence satisfying a uniform Lipschitz condition. 
For this construction, our assumption on the boundary data is more general 
than that usually required for the analogous conclusion using the three-point 
condition, and we are therefore able to discuss the Plateau problem in
non-parametric form for a domain which need not be convex. Furthermore, the
symmetrization method applies equally well for any finite number of
independent variables.


2. Interior variation and the integral $H$. In a plane domain $D$ with boundary
curves $C$, let $\psi$ be a solution of the minimum problem (2), with, for example,
prescribed boundary values on the curves $C$. We shall assume merely
that $\psi$ satisfies a Lipschitz condition in $D$, so that the first derivatives $\psi_x$ and
$\psi_y$ exist almost everywhere and are bounded. Instead of trying to derive the
Euler-Lagrange equation (1) by the classical method of varying $\psi$, we study
(2) in this section by performing infinitesimal transformations on the independent
variables $x$ and $y$ and by considering the shift of $\psi$ thus generated.

We let $f$ be a continuous complex-valued function in $D$, with piece-wise
continuous first derivatives, which vanishes on $C$. It is convenient to use the
complex notation $2f_z = f_x - if_y$, $2f_{\bar z} = f_x + if_y$, $z = x + iy$, for derivatives.
For small values of the complex parameter $\epsilon$, the transformation

$$z^* = z + \epsilon f, \qquad z^* = x^* + iy^*, \tag{3}$$

performs a one-to-one mapping of $D$ onto itself. We define a new function $\psi^*$
in $D$ by the formula

$$\psi^*(x^*, y^*) = \psi(x, y), \tag{4}$$

and we compare the values of the integral in (2) associated with the two
neighboring functions $\psi$ and $\psi^*$.

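Clearly

$$\iint_D F(\psi^{*2}_{x^*} + \psi^{*2}_{y^*})\,dx^*\,dy^* = \iint_D F\bigl(4[\psi_z z_{z^*} + \psi_{\bar z}\bar z_{z^*}][\psi_z z_{\bar z^*} + \psi_{\bar z}\bar z_{\bar z^*}]\bigr)\,\frac{\partial(x^*, y^*)}{\partial(x, y)}\,dx\,dy, \tag{5}$$

whence by elementary calculation

$$\iint_D F(\psi^{*2}_{x^*} + \psi^{*2}_{y^*})\,dx^*\,dy^* - \iint_D F(\psi_x^2 + \psi_y^2)\,dx\,dy = 2\,\mathrm{Re}\Bigl\{\epsilon \iint_D [(F - q^2F')f_z - 4F'\psi_z^2 f_{\bar z}]\,dx\,dy\Bigr\} + O(|\epsilon|^2), \tag{6}$$

where $q^2 = 4\psi_z\psi_{\bar z}$ and where $F$ and $F'$ stand for $F(q^2)$ and $F'(q^2)$. In the usual
fashion, we conclude from (6), from the extremal property (2) of $\psi$, and from
the arbitrary nature of $\epsilon$ that

$$\iint_D [(F - q^2F')f_z - 4F'\psi_z^2 f_{\bar z}]\,dx\,dy = 0 \tag{7}$$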


for every continuous piece-wise continuously differentiable function $f$ in $D$
which vanishes on $C$.

Let $\Omega$ be a closed subdomain of $D$, let $\omega$ be a continuously differentiable
function in $D$ which is identically 1 in $\Omega$ and which vanishes on $C$, let $t$ be an
interior point of $\Omega$, and let $\rho > 0$ be so small that the circle $|z - t| < \rho$ is
contained in the interior of $\Omega$. We define

$$f = \frac{\bar z - \bar t}{\rho^2} \tag{8}$$

for $|z - t| < \rho$ and we define

$$f = \frac{\omega}{z - t} \tag{9}$$

in $D$ for $|z - t| > \rho$. We can substitute this function $f$ into (7) to obtain

$$\frac{4}{\rho^2}\iint_K F'\psi_z^2\,dx\,dy = -\iint_{\Omega - K} \frac{F - q^2F'}{(z - t)^2}\,dx\,dy + A(t), \tag{10}$$

where $K$ denotes the circle $|z - t| < \rho$ and where $A(t)$ is the analytic function

$$A(t) = \iint_{D - \Omega} \Bigl[(F - q^2F')\Bigl(\frac{\omega}{z - t}\Bigr)_z - \frac{4F'\psi_z^2\,\omega_{\bar z}}{z - t}\Bigr]\,dx\,dy \tag{11}$$

in $\Omega$.

Letting $\rho \to 0$, we find almost everywhere in $\Omega$

$$4\pi F'\psi_t^2 = -\iint_\Omega \frac{F - q^2F'}{(z - t)^2}\,dx\,dy + A(t), \tag{12}$$

where the integral on the right is to be interpreted in the sense of the Cauchy
principal value. In the case of the Dirichlet problem, $F = q^2$ and the formula
(12) shows immediately that $\psi_t^2$ is an analytic function. Thus we obtain in
the simplest situation, corresponding to an incompressible fluid, a quite elegant
proof that the solution of the minimum problem (2) is a regular function.

In order to study the general non-linear problem, we introduce the integral

$$H = -\frac{2}{\pi}\iint_\Omega (F - q^2F')\log|z - t|\,dx\,dy + B, \tag{13}$$

where $B$ is a real harmonic function in $\Omega$ such that $\pi B_{tt} = -A(t)$. By (12) and
by standard lemmas on the second derivatives of a logarithmic potential, we
find almost everywhere in $\Omega$

$$H_{t\bar t} = -(F - q^2F'), \tag{14}$$

$$H_{tt} = -4F'\psi_t^2, \qquad H_{\bar t\bar t} = -4F'\psi_{\bar t}^2. \tag{15}$$

This gives in turn

$$H_{tt}H_{\bar t\bar t} - H_{t\bar t}^2 = 2q^2FF' - F^2. \tag{16}$$

We can eliminate $q$ from the equations (14) and (16) to obtain for the real
function $H$ a partial differential equation of the form

$$\Delta H = Q(H_{xx}H_{yy} - H_{xy}^2), \qquad \Delta H = H_{xx} + H_{yy}, \tag{17}$$

where $Q$ is a real analytic function of its real argument which is completely
determined by $F$. For the most general function $F$ corresponding to an arbitrary
equation of state, (17) is a non-linear equation equivalent to (1) which
involves only very special combinations of second derivatives of the auxiliary
function $H$. The significance of this second order partial differential equation
is that its derivation requires only a Lipschitz condition on the stream function
$\psi$.

The form of (17) suggests finding those integrands $F$ for which it reduces to
the Monge-Ampère equation

$$\Delta H = H_{xx}H_{yy} - H_{xy}^2. \tag{18}$$

This reduction takes place, according to (14) and (16), when $F$ satisfies the
ordinary differential equation

$$2q^2FF' - F^2 = 2\lambda(q^2F' - F), \tag{19}$$

for a suitable value of the constant $\lambda$. We check immediately that (19) has the
general solution

$$F = \lambda + \sqrt{\lambda^2 + \mu q^2}, \tag{20}$$

whence the Monge-Ampère equation (18) is seen to correspond to the case in
which (2) is the Plateau problem.
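
The claim that (20) solves (19) can be spot-checked numerically. The sketch below is an illustration only and rests on two stated assumptions: (19) is read as $2q^2FF' - F^2 = 2\lambda(q^2F' - F)$ and (20) as $F = \lambda + \sqrt{\lambda^2 + \mu q^2}$, and the prime is taken as the derivative with respect to $s = q^2$, in line with the convention that $F'$ stands for $F'(q^2)$. The function names and the sample values of $\lambda$, $\mu$ are hypothetical.

```python
import math


def F(s, lam, mu):
    """F as a function of s = q^2, following the reading of (20) stated above."""
    return lam + math.sqrt(lam**2 + mu * s)


def F_prime(s, lam, mu):
    """Derivative of F with respect to s = q^2."""
    return mu / (2.0 * math.sqrt(lam**2 + mu * s))


def ode_residual(s, lam, mu):
    """Left side minus right side of (19), evaluated at s = q^2."""
    f, fp = F(s, lam, mu), F_prime(s, lam, mu)
    return (2.0 * s * f * fp - f**2) - 2.0 * lam * (s * fp - f)


if __name__ == "__main__":
    for lam in (0.5, -0.5, 2.0):
        for mu in (1.0, 3.0):
            worst = max(abs(ode_residual(s, lam, mu)) for s in (0.0, 0.1, 1.0, 10.0))
            print(f"lambda={lam}, mu={mu}: max residual {worst:.2e}")
```

The residuals should sit at rounding level for every choice of the two constants, which is what one expects of a one-parameter family of solutions of the first order equation (19).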

We arrive in this way at a proof of the analyticity of a minimal surface. For
we can apply the Legendre transformation

$$h + H = ux + vy, \qquad u = H_x, \qquad v = H_y, \tag{21}$$

to (18) in order to obtain the linear elliptic equation

$$h_{uu} + h_{vv} = 1. \tag{22}$$

The solution $h$ of (22) is evidently an analytic function of $u$ and $v$, whence $H$
is an analytic function of $x$ and $y$, and by (14) and (15) we can conclude that
the function $\psi$ is also analytic in $x$ and $y$. The Poisson equation (22) yields,
furthermore, a procedure for constructing flow patterns explicitly according to
the formulas of this section in the case where the equation of state of the fluid
leads to an integrand of the type (20).


3. Symmetrization and the Lipschitz condition. The analysis of the preceding
section exploited formal manipulations in order to demonstrate, in
certain important special cases, the analyticity of solutions of (2). In this
section, we complete our discussion of the Plateau problem by constructing a
minimal sequence which fulfills a uniform Lipschitz condition.

Along the curves $C$ bounding the domain $D$, we assign boundary values which
generate in space a system of smooth closed curves $\Gamma$. The problem is to span
through the curves $\Gamma$ a non-parametric minimal surface over the domain $D$.
For this problem of the type (2), we can clearly find a minimal sequence of
surfaces whose areas approach the minimum value in question. There is no loss
of generality if we assume that each of these surfaces lies in the convex hull of
the system of curves $\Gamma$, since, when this is not true for a given surface, we can
diminish the area of the surface by replacing portions of it by sections of planes
tangent to the convex hull of $\Gamma$. Furthermore, it is permissible to diminish the
area of any of the surfaces of the minimal sequence in a similar manner by
replacing portions of them by simply-connected sections of catenoids, or of
other specific known minimal surfaces, which do not intersect $\Gamma$. Thus we may
suppose that all members of our minimal sequence lie in the largest closed region
$E$ projecting onto $D$ which contains the curves $\Gamma$, but which does not intersect
in a simply-connected surface element any plane or catenoid not meeting $\Gamma$.
This latter condition means that $E$ cannot be diminished infinitesimally by
cutting off a volume element with a plane or catenoid which does not intersect
$\Gamma$.

We now make the assumption on the boundary curves $\Gamma$ that the closed set
$E$ is so situated that, for some $\theta > 0$, all lines making an angle smaller than $\theta$
with the normal to the plane domain $D$ and intersecting $\Gamma$ have only one
point in common with $E$. This assumption restricts the curvature of $\Gamma$, but
does not imply that $D$ is a convex domain, since in some cases $D$ can even be
multiply-connected.

We consider any rectangular coordinate system in which the $z$-axis makes an
angle less than $\theta$ with the normal to the plane of $D$. In this coordinate system,
any element of our minimal sequence has a non-parametric representation
$z = z(x, y)$ which may be multiple-valued, with branches

$$z = z_1(x, y),\quad z = z_2(x, y),\quad \ldots,\quad z = z_{2m+1}(x, y), \qquad z_1 < z_2 < \cdots < z_{2m+1}.$$

We symmetrize such a surface by replacing it with the surface whose
non-parametric representation is the single-valued expression

$$z = \sum_{k=1}^{2m+1} (-1)^{k+1} z_k(x, y). \tag{23}$$


It follows from the basic results of Steiner, or, more directly, from Minkowski's
inequality (3), that the symmetrization process (23) diminishes or leaves
unchanged the area of the surface. Furthermore, if symmetrization in one such
coordinate system is followed by symmetrization in another, the resulting
surface still has a single-valued non-parametric representation in the first
coordinate system. This can be checked by making an affine transformation
such that the directions of the two symmetrizations become perpendicular,
a case in which the result is evident. Finally, and most important of all, the
symmetrization procedure (23) does not alter the boundary curves $\Gamma$ of any of
our surfaces, since the surfaces lie in the set $E$, which has the property that if
any line parallel to one of our $z$-axes intersects $\Gamma$, then it intersects $\Gamma$ and $E$ in
only one single point.
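
At a single point $(x, y)$ the symmetrization rule (23) is just an alternating sum of the branch heights. The sketch below illustrates this pointwise step only; it assumes the branch values over one point are supplied as a plain list of floats, and the function name `symmetrize` is hypothetical rather than taken from the paper.

```python
def symmetrize(branches):
    """Alternating sum z_1 - z_2 + z_3 - ... over an odd number of branch heights.

    `branches` holds the heights z_k(x, y) of a multiple-valued surface over a
    single point (x, y); they are sorted so that z_1 < z_2 < ... as in the text.
    """
    zs = sorted(branches)
    if len(zs) % 2 == 0:
        raise ValueError("expected an odd number 2m + 1 of branches")
    # enumerate starts at 0, so the sign (-1)**k matches (-1)**(k+1) for 1-based k.
    return sum((-1) ** k * z for k, z in enumerate(zs))


if __name__ == "__main__":
    print(symmetrize([1.0, 2.0, 4.0]))  # three branches over one point
    print(symmetrize([0.5]))            # a single-valued point is left unchanged
```

For three branches the result always lies between the lowest and the highest branch, since $z_3 - z_2 > 0$ and $z_1 - z_2 < 0$; a point over which the surface is already single-valued is not moved.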
These considerations show that we can assume without loss of generality
that each surface of our minimal sequence has a single-valued non-parametric
representation in every coordinate system whose $z$-axis is inclined at an angle
less than $\theta$ with the normal to the plane of $D$. But if we denote by $z = \psi(x, y)$
the representation of such a surface in a coordinate system such that $D$ lies in
the $(x, y)$-plane, then this result implies that $\psi$ satisfies the Lipschitz condition

$$|\psi(x_2, y_2) - \psi(x_1, y_1)| \le M[(x_2 - x_1)^2 + (y_2 - y_1)^2]^{1/2}, \tag{24}$$

with $M = \cot\theta$. Thus we obtain a minimal sequence of surfaces satisfying the
Lipschitz condition (24), and from the lower semi-continuity of the area
integral (2) we can deduce the existence of a solution of (2) satisfying the same
Lipschitz condition, for $F = (1 + q^2)^{1/2}$.
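
The bound $M = \cot\theta$ in (24) can be tested on sampled data. The following is a rough sketch under stated assumptions: the surface is given as a Python callable `psi`, the condition is checked only on a finite list of sample points (so it can refute, but never prove, the Lipschitz condition), and all function names are hypothetical.

```python
import math
from itertools import combinations


def lipschitz_bound(theta):
    """The constant M = cot(theta) appearing in (24)."""
    return 1.0 / math.tan(theta)


def max_difference_quotient(psi, points):
    """Largest |psi(p2) - psi(p1)| / |p2 - p1| over all pairs of sample points."""
    best = 0.0
    for (x1, y1), (x2, y2) in combinations(points, 2):
        dist = math.hypot(x2 - x1, y2 - y1)
        if dist > 0.0:
            best = max(best, abs(psi(x2, y2) - psi(x1, y1)) / dist)
    return best


def satisfies_lipschitz(psi, points, theta, tol=1e-12):
    """True if no sampled pair violates (24) with M = cot(theta)."""
    return max_difference_quotient(psi, points) <= lipschitz_bound(theta) + tol


if __name__ == "__main__":
    grid = [(0.5 * i, 0.5 * j) for i in range(5) for j in range(5)]
    plane = lambda x, y: 0.5 * x + 0.2 * y  # hypothetical sample surface
    print(max_difference_quotient(plane, grid), lipschitz_bound(math.pi / 4))
    print(satisfies_lipschitz(plane, grid, math.pi / 4))
```

A smaller admissible angle $\theta$ gives a larger constant $M$, so the estimate weakens as the cone of admissible $z$-axes narrows.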

A combination of the techniques of this section and of the previous section 
yields a solution of the Plateau problem in non-parametric form for a domain 
D which is not necessarily convex, but for a system of smooth boundary 
curves $\Gamma$ satisfying the above geometrical condition relative to certain planes,
catenoids, and projections. Our method for developing such a condition is a 
generalization to non-linear elliptic equations of the majorization principles 
of the theory of linear elliptic partial differential equations based on the 
maximum principle. 

A point of interest in the symmetrization construction presented in this 
section is that it yields estimates of precisely the same nature in space of any 
number of dimensions. In particular, we can treat the variational problem of 
minimizing 


$$\iiint (1 + \psi_x^2 + \psi_y^2 + \psi_z^2)^{1/2}\,dx\,dy\,dz, \tag{25}$$

which has applications to the study of three-dimensional flow of a Kármán-Tsien
gas.

CORRECTION TO
"NULL TRIGONOMETRIC SERIES IN
DIFFERENTIAL EQUATIONS"


CHARLES WALMSLEY 


My paper, Null trigonometric series in differential equations, in this Journal,
5 (1953) 536-543, contains an error which I wish to correct.

The correction is to add to the statement of the theorem on page 541 the 
condition: 


(iii) the integrated series $\sum_{n \ne 0} \gamma_n e^{inx}/n$ be summable $(C, k - 1)$ or convergent,
uniformly as in (ii).


No change is needed in the remarks on page 542 as to the scope of the 
theorem. In case (a), assuming that the function f(x) is represented by a 
(Lebesgue) Fourier series, a standard theorem (4, p. 30; 6, p. 340) ensures that 
the integrated series is uniformly convergent. In case (b) the series arising 
from meromorphic functions necessarily have summable conjugates and corre- 
spondingly summable integrated series. In these cases condition (iii) is redun- 
dant. 

The error arose in the discussion of §3 (p. 538) and in the proof of the
theorem of §4 (p. 542). Using $\sum'$ to denote summation from 1 to $\infty$, the known
theorem on convergence factors (2, Theorem 76) states that if $\sum' u_n$ is summable
$(C, k)$ then $\sum' u_n/n^s$ is summable $(C, k - s)$ if $0 < s < k + 1$. This is
applied with $s = 1, 2, \ldots, m$ to the trigonometric series which, expressed in
conventional two-way form, is $\sum w_n$, actually representing $\sum' (w_n + w_{-n})$ where

$$w_n = (in)^m c_n e^{inx}.$$

If $s$ is even $\sum w_n/n^s$ represents $\sum' (w_n + w_{-n})/n^s$, but if $s$ is odd it represents the
conjugate series $\sum' (w_n - w_{-n})/n^s$. The error lies in ignoring this distinction
between the odd and even cases.
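
The convergence-factor theorem quoted above can be illustrated in its simplest case $k = 1$, $s = 1$. The sketch below is an illustration and not part of the correction: it takes the series $1 - 1 + 1 - \cdots$, which is not convergent but is summable $(C, 1)$ to $1/2$, and checks that dividing the $n$-th term by $n$ gives a series summable $(C, 0)$, that is, convergent in the ordinary sense. The function names and the truncation length are hypothetical choices.

```python
import math


def partial_sums(terms):
    """Ordinary partial sums s_1, s_2, ... of a finite list of terms."""
    total, sums = 0.0, []
    for t in terms:
        total += t
        sums.append(total)
    return sums


def cesaro_c1_means(terms):
    """(C, 1) means: running averages of the partial sums."""
    means, running = [], 0.0
    for n, s in enumerate(partial_sums(terms), start=1):
        running += s
        means.append(running / n)
    return means


def demo(n_terms=1000):
    """Return the last (C, 1) mean of sum u_n and the last partial sum of sum u_n / n."""
    u = [(-1.0) ** (n + 1) for n in range(1, n_terms + 1)]   # 1 - 1 + 1 - ...
    u_over_n = [u[n - 1] / n for n in range(1, n_terms + 1)]  # 1 - 1/2 + 1/3 - ...
    return cesaro_c1_means(u)[-1], partial_sums(u_over_n)[-1]


if __name__ == "__main__":
    c1_value, plain_value = demo()
    print("(C, 1) mean of 1 - 1 + 1 - ...  :", c1_value)
    print("partial sum of 1 - 1/2 + 1/3 ...:", plain_value, "vs log 2 =", math.log(2.0))
```

The second series is the alternating harmonic series, whose partial sums approach $\log 2$, so one order of Cesàro summability is traded for one power of $n$ in the denominator, as the theorem asserts for $0 < s < k + 1$.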

The most direct way to validate the argument would be to assume conditions
ensuring the summability $(C, k)$ of the conjugate series $\sum w'_n$, where
$w'_n = w_n\,\mathrm{sgn}\,n$. The theorem could then be applied to this conjugate series for
odd values of $s$ and to the original series for even values. But since the series
used are $\sum w_n$, $\sum w'_n/n$, etc., a better way is to assume conditions for the
$(C, k - 1)$ summability of $\sum w'_n/n$, which is the integrated series of $\sum w_n$.

In the theorem of §4 the coefficient $c_n$ is given by 4.2, 4.3 and consists of two
parts, arising respectively from the null series and from the series for $f(x)$.
Write

$$w_n = (in)^m c_n e^{inx} = \lambda_n + \mu_n.$$

The series $\sum \lambda_n$, arising from the null series, and its conjugate, are necessarily
summable as desired. The desired summability of $\sum \mu_n$ and $\sum \mu_n\,\mathrm{sgn}\,n/n$ is
ensured if we assume the $(C, k - 1)$ summability of the integrated series in
addition to condition (ii). The extra assumption is condition (iii) above.